ABSTRACT

Galaxy proto-clusters at z > 2 provide a direct probe of the rapid mass assembly and galaxy growth of
present-day massive clusters. Because precise galaxy redshifts are needed for density mapping, and because
star formation is prevalent before quenching, nearly all the proto-clusters known to date were confirmed by
spectroscopy of galaxies with strong emission lines. Large emission-line galaxy surveys therefore provide an
efficient way to identify proto-clusters directly. Here we report the discovery of a large-scale structure at
z = 2.44 in the HETDEX Pilot Survey. On a scale of a few tens of comoving Mpc, this structure shows a complex
overdensity of Lyα emitters (LAEs), which coincides with broad-band selected galaxies in the
COSMOS/UltraVISTA photometric and zCOSMOS spectroscopic catalogs, as well as with overdensities of
intergalactic gas revealed in the Lyα absorption maps of Lee et al. (2014). We construct mock LAE catalogs to
predict the cosmic evolution of this structure. We find that such an overdensity should have already broken
away from the Hubble flow, and part of the structure will collapse to form a galaxy cluster with […] $M_\odot$
by z = 0. The structure contains a higher median stellar mass of broad-band selected galaxies, a boost of
extended Lyα nebulae, and a marginal excess of active galactic nuclei relative to the field, supporting a
scenario of accelerated galaxy evolution in cluster progenitors. Based on the correlation between galaxy
overdensity and the z = 0 descendant halo mass calibrated in the simulation, we predict that several hundred
1.9 < z < 3.5 proto-clusters with z = 0 mass of > $10^{[\ldots]}\,M_\odot$ will be discovered in the
8.5 Gpc$^3$ of space surveyed by the Hobby Eberly Telescope Dark Energy Experiment.

Subject headings: cosmology: observations - galaxies: clusters: general - galaxies: evolution - galaxies: 
high-redshift 


1. INTRODUCTION 

Galaxy proto-clusters at z > 2 are the “crime scene” of the rapid mass assembly and galaxy growth of
present-day massive clusters. During this epoch, the most massive dark matter (DM) halos in cluster
progenitors are just about to cross the characteristic mass scale of $10^{[\ldots]}\,M_\odot$ (Chiang et al.
2013b; Wu et al. 2013), coinciding with the increasing dominance of various intra-cluster processes seen in
fully formed clusters. The total star formation rate (SFR) of a z > 2 proto-cluster is predicted to be ~3
orders of magnitude higher than that of its z = 0 descendant (Behroozi et al. 2013), implying a rapid build-up
of the stellar content in line with an emerging quiescent galaxy population. Efficient baryon accretion onto
massive galaxies via cold streams from the gaseous cosmic web might be switching to an inefficient mode due
to a uniformly shock-heated medium. Such a transition is expected to take place in the largest halos first,
i.e., in cluster progenitors during this epoch (Keres et al. 2005; Dekel & Birnboim 2006; Dekel et al. 2009b).
The subsequent virialization on both galaxy and cluster scales, within about a dynamical timescale, largely
erases the signatures of the aforementioned processes, placing a fundamental limit on inferences drawn from
the largely archaeological record of cluster formation provided by near-field studies. Direct studies of
cluster progenitors thus provide irreplaceable probes of the formation of present-day massive clusters.

The search for high-redshift cluster progenitors is challenging because they lack mature cluster signatures
such as extended X-ray emission (Fassbender et al. 2011), the Sunyaev-Zel’dovich effect (Bleem et al. 2015),
and a prominent galaxy red sequence (Gladders & Yee 2005; Gilbank et al. 2011). The fundamental picture of
gravitational structure formation implies that the most massive collapsed objects evolved from the densest
regions in the early universe on large scales (Kravtsov & Borgani 2012, and references therein). Finding
proto-clusters therefore requires identifying galaxy overdensities in three dimensions using precise redshift
measurements (Chiang et al. 2013b).

Active star formation in cluster progenitors implies that (at least for the purpose of proto-cluster search
and identification) more focus should be placed on star-forming galaxies instead of the quiescent ones that
play a dominant role in traditional cluster studies. The difficulty in mapping the high-redshift cosmic
density field is alleviated by the prevalence of emission lines in these star-forming galaxies, for which a
spectroscopic redshift can be obtained once the line transition is identified. Therefore, nearly all the ~25
proto-clusters known to date (see the recent compilation in Chiang et al. 2013b) were found and/or confirmed
spectroscopically by overdensities of galaxies with strong emission lines, particularly Lyα redshifted into
the optical window (Steidel et al. 1998, 2000, 2005; Kurk et al. 2000, 2004; Pentericci et al. 2000, 2002;
Venemans et al. 2002, 2004, 2005, 2007; Shimasaku et al. 2003; Palunas et al. 2004; Matsuda et al. 2005; Ouchi
et al. 2005; Prescott et al. 2008; Kuiper et al. 2011; Yamada et al. 2012b; Cucciati et al. 2014; Lee et al.
2014b; Lemaux et al. 2014; Saito et al. 2015). Hα emitters are also used as density tracers (Hatch et al.
2011; Matsuda et al. 2011; Hayashi et al. 2012; Koyama et al. 2013).

Massive proto-clusters at z > 2, although having a much larger radius of influence than clusters in the local
universe, occupy only ~1/1000 of the cosmic volume (Chiang et al. 2013b). Their abundance, by definition, is
as low as that of galaxy clusters at z = 0. An effective survey of proto-clusters thus needs to probe an
extremely large volume. Traditional multi-object slit spectroscopy, although providing reliable redshifts and
galaxy spectral diagnostics, is expensive as a survey tool on this scale. Narrow-band imaging (with a larger
redshift uncertainty than direct spectroscopy) has been successful in finding overdensities of Lyα emitters
(LAEs) in both blank fields (Ouchi et al. 2005) and targeted fields around powerful radio galaxies (see a
summary in Venemans et al. 2007). This technique also revealed a puzzling but fascinating population of
diffuse Lyα halos, the so-called Lyα “blobs”, in overdense regions (Steidel et al. 2000; Prescott et al. 2008;
Matsuda et al. 2009; Yang et al. 2009; Erb et al. 2011; Matsuda et al. 2012). However, narrow-band imaging
typically requires a region of interest with a known redshift; if used as a survey tool, it probes only a
small volume in a thin redshift slice of Δz ~ 0.1.

Blind spectroscopy provides an opportunity to greatly increase the survey volume. For instance, wide-field
slitless grism or prism spectroscopy (e.g., the baseline redshift surveys of the future Wide Field Infrared
Survey Telescope (WFIRST; Spergel et al. 2013, 2015) and the Euclid mission (Laureijs et al. 2011)) is
particularly suitable for searches for proto-clusters traced by bright emission-line galaxies. Integral field
unit (IFU) spectroscopy has even greater potential, because it avoids the trade-off between spectral
resolution and survey depth that spectral crowding and confusion impose on grism surveys. For the same reason
of source crowding, blind grism spectroscopy strongly demands space-based spatial resolution, while the IFU
technique is feasible with ground-based facilities. However, early IFU instruments have focused on achieving
sub-arcsecond sampling in a relatively small field of view (e.g., Eisenhauer et al. 2003; Larkin et al. 2006;
Bacon et al. 2010, 2015), making them less suitable for proto-cluster searches.

The Hobby Eberly Telescope Dark Energy Experiment (HETDEX; Hill et al. 2008b) is pioneering the
instrumentation development and observations of high-redshift large-scale structures using wide-field IFUs.
In a 3 year baseline starting from late 2015, HETDEX will leverage the cosmic evolution of the dark energy
equation of state with high-redshift (z > 2) constraints imprinted by the Baryon Acoustic Oscillations (BAO;
Eisenstein 2005) in the early universe. The program will perform a redshift survey of LAEs in 300 deg$^2$
(Spring field) plus 150 deg$^2$ (Fall field) at 1.9 < z < 3.5 (with a filling factor of 1/4.5), with a total
survey volume of ~8.5 Gpc$^3$. The survey uses the 10-m Hobby-Eberly Telescope (HET; Ramsey et al. 1998) with
a wide-field upgrade to reach a 22 × 22 arcmin$^2$ field of view. Blind spectroscopy (R ~ 750 in 350-550 nm)
with no pre-selection of targets will be performed using the Visible Integral-field Replicable Unit
Spectrograph (VIRUS; Hill et al. 2012, 2014). With the LAE redshifts, HETDEX will pinpoint numerous locations
of the highest density concentrations at 1.9 < z < 3.5, generating a large and homogeneous sample of cluster
progenitors in the key epoch of cluster formation before virialization.
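As a rough cross-check of the survey volume quoted above, the comoving volume of the 300 + 150 deg$^2$ footprint between z = 1.9 and z = 3.5 can be estimated directly. The sketch below is illustrative only: it assumes a flat ΛCDM model with the WMAP7 parameters adopted in this paper, ignores radiation, and treats the footprint as a simple solid angle with no filling factor applied; the function names are hypothetical. The result comes out at the same order as the quoted ~8.5 Gpc$^3$, and the exact number depends on footprint details not modeled here.

```python
import math

# Flat LCDM parameters quoted in the text (WMAP7); radiation is ignored.
H0 = 70.4            # Hubble constant in km/s/Mpc (h = 0.704)
OMEGA_M = 0.272      # matter density parameter
OMEGA_L = 0.728      # dark energy density parameter
C_KMS = 299792.458   # speed of light in km/s


def e_of_z(z):
    """Dimensionless expansion rate E(z) = H(z)/H0."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance in Mpc (composite Simpson rule)."""
    if steps % 2:
        steps += 1
    h = z / steps
    total = 1.0 / e_of_z(0.0) + 1.0 / e_of_z(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) / e_of_z(i * h)
    return C_KMS / H0 * total * h / 3.0


def survey_volume_gpc3(area_deg2, z_min, z_max):
    """Comoving volume in Gpc^3 of a solid angle between two redshifts."""
    full_sky_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2
    shell = 4.0 / 3.0 * math.pi * (
        comoving_distance(z_max) ** 3 - comoving_distance(z_min) ** 3
    )
    return shell * area_deg2 / full_sky_deg2 / 1e9


footprint_volume = survey_volume_gpc3(300.0 + 150.0, 1.9, 3.5)
print(f"Footprint comoving volume: {footprint_volume:.1f} Gpc^3")
```

The same helper can be reused for other areas and redshift ranges, for example the 169 arcmin$^2$ pilot survey at 1.9 < z < 3.8.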

As a proof of concept, the HETDEX Pilot Survey (HPS; Adams et al. 2011) performed blind spectroscopy over a
169 arcmin$^2$ area (divided into four sub-fields) for bright LAEs at 1.9 < z < 3.8, which corresponds to a
volume of ~$10^{6}$ Mpc$^3$. A total of 105 LAEs were discovered and studied in detail (Adams et al. 2011;
Blanc et al. 2011; Finkelstein et al. 2011; Chonis et al. 2013; Hagen et al. 2014; Song et al. 2014).

Among the LAEs discovered in HPS, there is a concentration of nine LAEs across a 71.6 arcmin$^2$ region in the
HPS-COSMOS field, which lie in a narrow redshift range at z ~ 2.44 (LAE overdensity of > 4 in a comoving
volume of ~10 × 10 × 35 $h^{-3}$ Mpc$^3$). Here we present a detailed characterization of this structure using
HPS data, supplemented with a publicly available catalog of continuum-selected galaxies with photometric
redshifts from COSMOS/UltraVISTA. We use a cosmological simulation to model the realistic connection between
LAEs and the underlying matter field and the complex nonlinear gravitational structure formation across cosmic
history. Our study shows that part of this structure will collapse to form a galaxy cluster with […]
$M_\odot$ by z = 0. The structure (together with another similar overdensity partially covered by HPS) hosts
several extended Lyα halos, some of which are identified as active galactic nuclei (AGNs) in the X-ray. These
systems are commonly found in overdense regions at high redshift, perhaps indicating an accelerated
co-evolution of massive galaxies and their supermassive black holes in overdense environments.

In § 2, we describe our LAE and continuum-selected galaxy samples, and we construct a suite of mock LAE
catalogs with clustering properties bracketing that of the observed LAEs. In § 3, we present the spatial
distributions of galaxies in HPS-COSMOS along the line of sight and on the projected sky. In § 4, we place
this structure in the context of cosmic structure formation based on the cosmological simulation connected
through the mocks. In § 5, we demonstrate a significant enhancement of diffuse Lyα halos and AGNs in this
structure. In § 6, we present the outlook for proto-cluster identification in the HETDEX survey. We discuss
the results in § 7 and conclude this work in § 8. Cosmological parameters based on the 7 year Wilkinson
Microwave Anisotropy Probe (Komatsu et al. 2011) are adopted: ($h$, $\Omega_m$, $\Omega_\Lambda$, $n_s$,
$\sigma_8$) = (0.704, 0.272, 0.728, 0.967, 0.81). All magnitudes given are in the AB system (Oke & Gunn 1983).

2. GALAXY SAMPLES AND SIMULATIONS 

Cluster formation is directly driven by the evolution of the matter density field under gravitational
processes. However, DM, being the dominant component of the matter density, has no direct electromagnetic
signature. We follow the standard formalism using galaxies as (in general, biased) tracers of the underlying
density field. Here we describe our HPS LAE sample and the COSMOS/UltraVISTA catalog of continuum-selected
(“photo-z”) galaxies. We generate mock LAEs matched in bias and stellar mass to bridge the gap between
luminous matter and DM, and also to connect observations in a fixed lightcone to cosmological simulations that
model their time evolution.

2.1. Lyα Emitters: The HETDEX Pilot Survey

In this work, we mainly use the LAE sample in HPS-COSMOS, the largest contiguous HPS sub-field of 71.6
arcmin$^2$ (~7′ × 10′) near the center of the COSMOS field. The sample contains a total of 52 LAEs at
1.9 < z < 3.8, four of which show X-ray emission (matched with the catalog of Elvis et al. 2009).^9 The field
of HPS-COSMOS partially overlaps with several deep surveys that cover redshifts z > 2, including CANDELS (PI:
Faber, Ferguson), VVDS/VUDS (PI: Le Fèvre), zCOSMOS (PI: Lilly), ZFOURGE (PI: Labbé), 3D-HST (PI: van Dokkum),
and a pilot survey of CLAMATO (PI: Lee) for Lyα forest tomography. We will refer to some of the findings from
these surveys when relevant.

HPS (Adams et al. 2011) is a blind spectroscopic survey of emission-line galaxies using the Mitchell
Spectrograph, formerly called the VIRUS-P spectrograph (the VIRUS prototype; Hill et al. 2008a), on the 2.7 m
Harlan J. Smith telescope at McDonald Observatory. A single Mitchell Spectrograph pointing covers an area of
1.7′ × 1.7′ with a 1/3 filling factor using an array of 246 fibers, each 4″.235 in diameter. With a 6-dither
pattern, HPS reaches complete coverage of the field and sub-fiber-size spatial sampling. The survey contains
four sub-fields in COSMOS (71.6 arcmin$^2$), GOODS-N (35.5 arcmin$^2$), MUNICS (49.9 arcmin$^2$), and XMM-LSS
(12.3 arcmin$^2$) that are rich in ancillary multi-wavelength data, with a total survey area of 169
arcmin$^2$. The spectra cover a bandpass of 3500-5800 Å with a spectral FWHM of 5 Å ($\sigma_{\rm inst} \sim
130$ km s$^{-1}$ at 5000 Å). The survey probes LAEs at 1.9 < z < 3.8 with a single line expected within the
bandpass in a total effective volume of ~$10^{6}$ Mpc$^3$. Each line detection is matched with a continuum
counterpart, or an upper limit is determined if the counterpart is undetected. LAEs are then distinguished
from lower redshift galaxies with a single line detection (mainly unresolved [O II] λλ3727, 3729 emitters at
0.19 < z < 0.56) by an equivalent width (EW) criterion, where objects with a rest-frame
EW$_{{\rm Ly}\alpha}$ > 20 Å are classified as LAEs. The contamination rate is estimated to be 4%-10%. A total
of 105 LAEs are discovered in HPS down to an $L_{{\rm Ly}\alpha}$ limit of ~4 × $10^{42}$ erg s$^{-1}$
(roughly constant across the redshift range). Six of the 105 LAEs have X-ray counterparts, indicating the
presence of AGNs in ~5% of the sample. Among the nine LAEs in the large-scale structure at z = 2.44 (see § 3
and Table 1), four are covered by 3D-HST. The HPS LAE identifications for these four sources are all confirmed
by at least one 3D-HST metal line detection. Two additional LAEs in the z = 2.44 structure, together with the
four covered by 3D-HST, were followed up and confirmed using Magellan/IMACS spectroscopy with a spectral
resolution of 150 km s$^{-1}$ FWHM, revealing the unique asymmetric line profiles expected for Lyα and
excluding the possibility that they are foreground [O II] λλ3727, 3729 doublet emitters (Chonis et al. in
prep.).

^9 In this work we do not exclude AGNs from the LAE sample since all the Lyα-emitting objects provide reliable
redshifts to trace the underlying cosmic density field. We model the clustering properties of the full LAE
population in § 2.3. This treatment is favored for future applications of the full HETDEX survey, in which no
coordinated deep X-ray observations are planned to cover a significant fraction of the wide HETDEX field.
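The single-line classification described above can be illustrated with a short sketch. It assumes that the observed-frame equivalent width of the line has already been measured from the matched continuum counterpart; the function names, the example wavelength, and the example equivalent widths are hypothetical and are not taken from the HPS pipeline. The second helper reproduces the quoted instrumental dispersion from the 5 Å FWHM at 5000 Å.

```python
import math

LYA_REST = 1215.67     # Lyα rest wavelength in Angstrom
OII_REST = 3727.0      # approximate rest wavelength of the blended [O II] doublet
C_KMS = 299792.458     # speed of light in km/s
EW_REST_CUT = 20.0     # rest-frame EW threshold in Angstrom quoted in the text


def redshift(obs_wavelength, rest_wavelength):
    """Redshift implied by assigning a rest wavelength to an observed line."""
    return obs_wavelength / rest_wavelength - 1.0


def classify_single_line(obs_wavelength, ew_obs):
    """Label a single-line detection as an LAE or an [O II] emitter.

    The rest-frame EW is computed under the assumption that the line is Lyα,
    and the source is called an LAE if that value exceeds the 20 Angstrom cut.
    """
    z_lya = redshift(obs_wavelength, LYA_REST)
    ew_rest_lya = ew_obs / (1.0 + z_lya)
    if ew_rest_lya > EW_REST_CUT:
        return "LAE", z_lya
    return "[O II]", redshift(obs_wavelength, OII_REST)


def sigma_inst_kms(fwhm_angstrom, wavelength_angstrom):
    """Gaussian velocity dispersion corresponding to a spectral FWHM."""
    sigma_angstrom = fwhm_angstrom / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return C_KMS * sigma_angstrom / wavelength_angstrom


print(classify_single_line(5000.0, 200.0))
print(classify_single_line(5000.0, 40.0))
print(f"sigma_inst at 5000 A: {sigma_inst_kms(5.0, 5000.0):.0f} km/s")
```

A real classification also has to handle continuum non-detections (EW lower limits) and the measured contamination rate, which this sketch does not attempt.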

Hagen et al. (2014) estimated the stellar mass of 63 out of the total 74 LAEs in the HPS GOODS-N and COSMOS
fields by spectral energy distribution (SED) fitting of individual galaxies, finding a wide distribution of
log($M_*/M_\odot$) spanning from ~7.5 to ~10.5.

Adams et al. (2011) performed a curve of growth analysis to obtain robust total Lyα fluxes for the HPS LAEs
and estimated the spatial extent of the Lyα emission. As the survey uses 4″.235 diameter fibers with a dither
pattern to achieve a discrete sampling of < 3″ nearest fiber-center distances, no constraint below the scale
of a few arcseconds is obtained. Nonetheless, sources with an apparent spatial FWHM > 6″.81 (including the
effects of instrument, sampling, and seeing) can be ruled out as point sources with a confidence level of
99.7%.^10 Using this criterion, there are a total of 7 (10) extended Lyα halos in HPS-COSMOS (full HPS).
Table 1 presents the catalog for a selected subset of LAEs of interest in HPS-COSMOS.

2.2. Continuum-selected Photo-z Galaxies: The 
COSMOS/UltraVISTA 

We supplement the LAEs with continuum-selected galaxies with photometric redshifts (photo-z). Although their
redshift uncertainty is considerably larger than that of the LAEs, these objects provide a more mass-complete
sample over a wider field. We use the publicly available $K_s$-band selected photometric redshift galaxy
catalog of Muzzin et al. (2013) in the 1.62 deg$^2$ COSMOS/UltraVISTA survey. The catalog combines photometric
datasets from UltraVISTA (McCracken et al. 2012) for the near-IR, and Subaru/SuprimeCam (Taniguchi et al.
2007) and CFHT/MegaCam (Capak et al. 2007) for the optical. Information from the GALEX FUV and NUV (Martin et
al. 2005) and Spitzer IRAC+MIPS mid-IR data (Sanders et al. 2007) is included. The photo-z error of galaxies
at 2 < z < 3 is, on average, at a level of $\sigma_z/(1+z)$ = 2.5-3%. Here we use the sample above the 90%
completeness limit of $K_s$ < 23.4 mag, excluding a small fraction (~4%) of galaxies showing a broad and/or
multi-modal redshift probability distribution. A subsample of $K_s$ < 22.0 galaxies will be referred to as the
“bright” sample.
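To put the quoted photo-z error in context, a redshift uncertainty can be converted into a line-of-sight comoving distance uncertainty through $\sigma_\chi \simeq c\,\sigma_z / H(z)$. The sketch below applies this at z = 2.44, assuming the flat ΛCDM parameters adopted in this paper and comparing $\sigma_z/(1+z)$ = 3% with the $4 \times 10^{-4}$ LAE redshift uncertainty listed in the notes to Table 1. The function names are hypothetical, and peculiar velocities and Lyα velocity offsets are ignored.

```python
import math

# Flat LCDM parameters adopted in the text (WMAP7).
H0 = 70.4            # km/s/Mpc
OMEGA_M = 0.272
OMEGA_L = 0.728
C_KMS = 299792.458   # km/s


def hubble_rate(z):
    """H(z) in km/s/Mpc for flat LCDM (radiation ignored)."""
    return H0 * math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def los_comoving_sigma(z, sigma_z):
    """Line-of-sight comoving distance uncertainty (Mpc) for a redshift error."""
    return C_KMS * sigma_z / hubble_rate(z)


z0 = 2.44
photoz_sigma = los_comoving_sigma(z0, 0.03 * (1.0 + z0))  # sigma_z/(1+z) = 3%
lae_sigma = los_comoving_sigma(z0, 4e-4)                  # spectroscopic LAE redshift

print(f"photo-z: {photoz_sigma:.0f} Mpc, LAE: {lae_sigma:.2f} Mpc")
```

Under these assumptions the photo-z sample smooths the density field over a depth of order a hundred comoving Mpc, while the LAE redshifts resolve sub-Mpc scales, which is why the two samples play complementary roles.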

We will also use the galaxy stellar masses provided in Muzzin et al. (2013), derived by SED fitting with the
FAST code (Kriek et al. 2009) using a set of population synthesis models from Bruzual & Charlot (2003). Solar
metallicity, a Chabrier (2003) initial mass function (IMF), and a Calzetti et al. (2000) dust extinction law
are assumed. The uncertainty in stellar mass is ~0.2 dex.

2.3. Bias and Mass Matched Catalogs of Mock LAE 

ΛCDM cosmological N-body simulations and semi-analytic models (SAMs) of galaxy formation provide a framework
to model the complex hierarchical growth of DM and galaxies in three dimensions on the relevant scales, and to
link the evolution of large-scale structures across cosmic time. To characterize the LAE density concentration
in HPS-COSMOS at z = 2.44 (see § 3), we generate a set of mock catalogs of LAEs at z ~ 2.4 with realistic
clustering properties by post-processing the SAM of Guo et al. (2013) on top of a new run of the Millennium
Run (MR) cosmological DM N-body simulation (Springel et al. 2005) with the WMAP7 cosmology (Komatsu et al.
2011). The Guo et al. (2013) model improves upon the extensively tested models of De Lucia & Blaizot (2007)
and Guo et al. (2011). Various galaxy properties are reasonably reproduced, and we particularly rely on its
agreement with observations for galaxy clustering on large scales in the “two-halo” regime (Guo & White 2009;
Guo et al. 2011, 2013; Kang et al. 2012; Marulli et al. 2013; Chiang et al. 2014; Kang 2014; Pujol & Gaztanaga
2014; Skibba et al. 2014). The galaxy stellar mass is 95% and 60% complete to $10^{[\ldots]}$ and
$10^{[\ldots]}\,M_\odot$, respectively,^11 sufficient for the LAE modeling here.

^10 The scale of ~7″ coincides with the sum of the fiber size and the average sampling separation. A source of
~1″ would be detected (with at least about half of the fiber area filled for each fiber) by 10-12 fibers,
while a source of 6″ would be detected by only 4-6 fibers (see Figure 1 in Adams et al. 2011).

Table 1
HPS-COSMOS Lyα Emitter Catalog (Selected)

HPS     z^a      α (J2000)   δ (J2000)  Flux      L         Spectral   Spatial    Counterpart  Counterpart  EW_rest^e  Flux_X-ray
Index   (Lyα)    [deg]       [deg]      (Lyα)     (Lyα)     FWHM^b     FWHM^c     R            Prob.^d      (Lyα)      (0.5-10 keV)
                                        [10^-17   [10^42    [km s^-1]  [arcsec]   [mag]                     [Å]        [10^[…] cgs]
                                        cgs]      cgs]

z = 2.44 structure
160     2.4346   150.03587   2.29406    17.1      8.3       663        5.2        27.35        0.61         1034.3
162     2.4284   150.03637   2.25889    76.4      37.0      1063       8.3        24.45        0.20         564.3      370±67
164     2.4518   150.03729   2.28978    25.4      12.6      482        11.0       24.32        0.31         126.4
182     2.4337   150.05137   2.23778    25.6      12.5      211        […]        25.04        0.60         180.8
189     2.4515   150.05462   2.31564    12.9      6.4       509        5.1        24.99        0.64         85.3
197     2.4419   150.06121   2.29650    […]       8.7       536        […]        25.8         0.33         258.9
263     2.4323   150.12108   2.23589    24.1      11.7      511        5.8        24.17        0.89         66.3
306     2.4390   150.16504   2.22739    […]       18.7      766        […]        24.07        0.72         90.4
318     2.4558   150.18387   2.26636    30.3      15.1      349        8.0        23.69        0.32         74.5

Other Extended LAEs or AGNs
[…]     2.1751   150.02608   2.21969    84.0      […]       1164       […]        24.08        0.51         2380.9
148     3.4176   150.02917   2.32439    8.6       9.5       289        4.5        24.77        0.43         180.5      166±49
222     2.9430   150.07600   2.26417    87.3      67.5      983        […]        23.55        0.98         278.1      268±60
261     2.0960   150.11904   2.29678    143.7     48.4      886        8.3        23.76        0.87         536.7      2040±125

^a With an uncertainty of 4 × $10^{-4}$ based on a 0.5 Å line center uncertainty.
^b After deconvolution with a 5 Å FWHM instrumental resolution ($\sigma_{\rm inst} \sim 130$ km s$^{-1}$).
^c Including a tophat component of the fiber size of 4″.235 and the effects of dither pattern and discrete sampling.
^d Probability of counterpart association (R-band).
^e Based on an interpolation between the two nearest filters for continuum.

We aim to match simultaneously the LAE number density, the galaxy bias, and the stellar mass distribution to
the observed sample. Correlation length analyses suggest that high-redshift LAEs are less clustered than
broad-band selected Lyman-break galaxies (of a typical limiting magnitude of K < 23), with an overall linear
bias of 2.0 ± 0.6 at 2 < z < 3 (Gawiser et al. 2007; Ouchi et al. 2008; Guaita et al. 2010; Ouchi et al. 2010;
Bielby et al. 2015), and 2.5-4 at z ~ 4 (Kovac et al. 2007; Ouchi et al. 2008, 2010). Galaxy bias is known to
correlate strongly with stellar mass (or color/bolometric luminosity, see Coil 2013, and references therein)
but very weakly with Lyα luminosity (Orsi et al. 2008). Therefore the criterion to match the distribution in
stellar mass provides constraints on not only the effective galaxy bias of the entire population but also the
scatter of the bias (i.e., cosmic variance of the galaxy bias). The effect of the latter cannot be neglected
in the case of localized statistics in real space, which usually suffer from having a relatively small number
of objects. Conversely, the effect is less important in global statistics such as the correlation function and
power spectrum. A full theoretical modeling of Lyα radiative processes and a detailed match in Lyα luminosity
function and EW distribution are not required since these have negligible impact on the gravitational
clustering of LAEs once the criteria in bias and stellar mass are met.

^11 The mass completeness is evaluated by comparing with the same galaxy model applied to the Millennium-II
Simulation (Boylan-Kolchin et al. 2009) with a higher mass resolution.

To test the effects of clustering modeling on the final interpretation of the observed structure, we generate
a suite of four mock catalogs whose galaxy bias and stellar mass distributions are varied continuously to
bracket those estimated for the observed LAEs. For each mock catalog, we artificially elevate the SFR of SAM
galaxies, such that the same observational selection criteria of Lyα emission propagate to selecting model
galaxies with different clustering via the positive correlation of SFR versus $M_*$, the “star-forming main
sequence” (e.g., Reddy et al. 2012; Rodighiero et al. 2014). This treatment acknowledges the uncertainties and
bypasses the issue that at z ~ 2, most of the state-of-the-art cosmological simulations (both hydrodynamical
and SAM, including the one used here) generate a star-forming main sequence with a normalization 0.1-0.4 dex
lower than that observed (Speagle et al. 2014, and references therein), and not sufficiently “bursty” across
the full range of stellar mass^12 (see discussions in Weinmann et al. 2012; Furlong et al. 2014; Genel et al.
2014; Mitchell et al. 2014; White et al. 2015).

^12 The deficit of star-bursting objects in simulations across the star-forming sequence at z ~ 2 results in a
situation that only massive objects (thus high SFR) with low dust content would reach a high Lyα luminosity
and EW. Such a population is significantly more massive than that observed. Thus we assume that the ranks in
SFR for objects with a given stellar mass are statistically realistic in the simulation, but a large fraction
of objects should have a higher absolute value of SFR. We will show later that by implementing a systematic
SFR offset, other major galaxy properties of interest can be self-consistently reproduced and matched with
those observed. This result indicates that the discrepancy in the normalization of the star-forming sequence
is the sole fundamental problem at the level relevant to this work, which needs to be resolved in future SAMs.

Table 2
Summary of the Mock LAE Catalogs at z = 2.4

Simulation   Δlog(SFR/M_⊙ yr^-1)^a   b^b     log(M_*/M_⊙)^c   P_LAE^d
Mock I       1.0                     1.82    8.68             3%
Mock II      0.7                     2.00    8.99             4%
Mock III     0.4                     2.22    […]              6%
Mock IV      0.2                     2.44    […]              9%

^a The log(SFR/$M_\odot$ yr$^{-1}$) offset applied to all galaxies in a given mock to compensate for the
systematically low and insufficiently bursty SFR of the SAM at this redshift. This offset is used as the sole
control variable, which generates mocks with different bias and stellar mass.
^b Galaxy bias calculated at 8 $h^{-1}$ Mpc comoving.
^c Median of the stellar mass distribution. For comparison, the HPS star-forming LAEs at 1.9 < z < 3.8 are
estimated to have a median log($M_*/M_\odot$) = 8.74 (converted to the Chabrier IMF and the cosmological
parameters adopted here; Hagen et al. 2014).
^d Probability for star-forming galaxies to have the maximum values of $f^{{\rm Ly}\alpha}_{\rm esc}$
(Equation (2)), or equivalently the Lyα duty cycle, tuned to match the observed LAE number density.

A brief outline of our LAE modeling is as follows. We first systematically “burst” the SFR of SAM galaxies on
the star-forming sequence. We then model the intrinsic Lyα production by the galaxy instantaneous SFR, and the
effects of dust attenuation by empirical constraints, effectively generating a broad distribution in galaxy
$M_*$ that is consistent with what is found in observations. A high degree of stochasticity (a survival
probability, equivalently a Lyα duty cycle; see also Nagamine et al. 2010) is then adjusted by hand to match
the observed HPS LAE number density in each mock catalog. Finally, the two-point correlation function of the
mocks is evaluated, which serves as a check of the Lyα modeling and of this approach as a tool to study
large-scale structure. A summary of the suite of four mock LAE catalogs is given in Table 2. We describe the
details of these procedures in the following.

After applying the SFR offset, we first compute, for each SAM galaxy, the intrinsic Lyα luminosity
$L^{\rm int}_{{\rm Ly}\alpha}$ generated in star-forming H II regions using the empirical calibration for Hα
(Kennicutt 1998) and assuming an intrinsic Lyα to Hα ratio under Case B recombination (Brocklehurst 1971;
Osterbrock & Ferland 2006). This gives

$$L^{\rm int}_{{\rm Ly}\alpha} = 1.98 \times 10^{42}\,({\rm SFR}/M_\odot\,{\rm yr}^{-1})\ {\rm erg\,s^{-1}}, \qquad (1)$$

where the proportionality constant has been multiplied by a factor of 1.8 to convert from the Salpeter IMF
(Salpeter 1955) to the Chabrier IMF (Chabrier 2003) assumed in the SAM used here (both in a range of 0.1-100
$M_\odot$).

Next, we implement dust attenuation of Lyα photons in the
host galaxies. Due to the resonant nature of the transition, Lyα photons can experience long scattering path
lengths in the neutral interstellar medium (ISM) of the host galaxies. Thus a small amount of dust often
produces a significant level of absorption. As a result, only several percent of the full star-forming galaxy
population show observable Lyα emission (Hayes et al. 2010; Ciardullo et al. 2014). However, objects observed
as LAEs show a level of dust attenuation in Lyα roughly following that of the stellar continuum at 1216 Å
(Finkelstein et al. 2009, 2011; Blanc et al. 2011; Nakajima et al. 2012; Hagen et al. 2014). Specifically,
observed LAEs show a lower limit of the Lyα optical depth $\tau_{{\rm Ly}\alpha}$ that roughly equals
$\tau_{1216}$, the optical depth of the stellar continuum at 1216 Å, causing an upper limit in the Lyα escape
fraction,

$$\max(f^{{\rm Ly}\alpha}_{\rm esc}) = e^{-\tau_{1216}} = 10^{-0.4\,k_{1216}\,E(B-V)}, \qquad (2)$$

assuming the extinction of the stellar continuum follows the Calzetti et al. (2000) law ($k_{1216}$ = 11.98).
On the other hand, the observational criteria (mainly EW, but also $L_{{\rm Ly}\alpha}$) introduce a
selection effect such that galaxies with $\tau_{{\rm Ly}\alpha} \gg \tau_{1216}$ would not be observed as
LAEs. Thus we set the Lyα escape fraction at the maximum value allowed by the dust content of each galaxy,
select those passing the $L_{{\rm Ly}\alpha}$ threshold of the HPS, and then drop a large fraction of galaxies
according to a survival probability (independent of any galaxy properties) to match the observed LAE number
density.^13 The dropped population corresponds to the large (> 90%) fraction of star-forming galaxies with
$\tau_{{\rm Ly}\alpha} \gg \tau_{1216}$, which thus produce no observable Lyα emission.
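The selection chain described above (SFR offset, Equation (1), the maximum escape fraction of Equation (2), the luminosity threshold, and the random survival probability) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the mocks: the dictionary keys `sfr` and `ebv`, the toy galaxies, and the function names are all hypothetical, the escape fraction assumes the form $10^{-0.4 k_{1216} E(B-V)}$ with $E(B-V)$ the stellar-continuum reddening, and the EW criterion is omitted.

```python
import random

K_1216 = 11.98          # Calzetti et al. (2000) curve at 1216 Angstrom, from the text
LYA_PER_SFR = 1.98e42   # erg/s per (Msun/yr), Equation (1)


def intrinsic_lya_luminosity(sfr, delta_log_sfr=0.0):
    """Intrinsic Lyα luminosity (erg/s) after applying a log-SFR offset in dex."""
    return LYA_PER_SFR * sfr * 10.0 ** delta_log_sfr


def max_escape_fraction(ebv_stellar):
    """Upper limit on the Lyα escape fraction for a given stellar E(B-V)."""
    return 10.0 ** (-0.4 * K_1216 * ebv_stellar)


def select_mock_laes(galaxies, delta_log_sfr, l_limit, duty_cycle, seed=0):
    """Return galaxies that pass the luminosity cut and the random survival draw.

    galaxies   : iterable of dicts with hypothetical keys 'sfr' (Msun/yr) and 'ebv'
    l_limit    : observed Lyα luminosity threshold in erg/s
    duty_cycle : survival probability, independent of galaxy properties
    """
    rng = random.Random(seed)
    selected = []
    for gal in galaxies:
        l_obs = intrinsic_lya_luminosity(gal["sfr"], delta_log_sfr)
        l_obs *= max_escape_fraction(gal["ebv"])
        if l_obs >= l_limit and rng.random() < duty_cycle:
            selected.append({**gal, "l_lya": l_obs})
    return selected


# Toy input: two made-up galaxies, not SAM output.
toy_galaxies = [{"sfr": 1.0, "ebv": 0.0}, {"sfr": 10.0, "ebv": 0.3}]
mock = select_mock_laes(toy_galaxies, delta_log_sfr=1.0, l_limit=4e42, duty_cycle=1.0)
print(len(mock), [f"{g['l_lya']:.2e}" for g in mock])
```

In this picture the SFR offset is the single knob that changes which galaxies clear the luminosity threshold (and hence the bias and stellar mass distribution of the mock), while the duty cycle only rescales the number density.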

We measure galaxy bias of the mocks by calculating the 
galaxy two-point correlation function and comparing it to that 
of the underlying DM at the same epoch. Multiple estima¬ 
tors (Peebles & Hauser 1974; Hewett 1982; Davis & Peebles 
1983; Hamilton 1993; Landy & Szalay 1993) are used; all 
give consistent results because of the large number (^ 5 x 10^) 
of LAEs per mock catalog. We Fourier transform the mat¬ 
ter power spectrum to obtain the matter two-point correlation 
function, where the power spectrum is calculated using the 
Cosmic Linear Anisotropy Solving System (CLASS) package 
(Lesgourgues 2011a; Bias et al. 2011; Lesgourgues 2011b). 
The linear galaxy bias is obtained using the standard defini¬ 
tion 

(3) 

U(r) 

at r = 8 Mpc h“^ comoving, where the ^gai and are the two- 
point correlation function of galaxies and matter, respectively. 
The galaxy bias of our set of four mocks spans a range of 1.8-2.4
(Table 2 and the top panel of Figure 1). This agrees well with that of
the observed LAEs at roughly the same epoch (Gawiser et al. 2007; Ouchi
et al. 2008, 2010; Guaita et al. 2010), where a small (< 5%) fraction of
LAEs with X-ray detection have been excluded from these observational
clustering analyses. X-ray AGN hosts are found to be more clustered
(Allevato et al. 2011, 2014), and thus the inclusion of these objects as
in this work, though subdominant, should elevate the sample-averaged
galaxy bias slightly.

A detailed matching of the simulated and observed Lyα luminosity
function and EW distribution requires relaxing our simplistic assumption
for $\tau_{\rm Ly\alpha}$. Because the clustering properties of LAEs are
reproduced, however, we do not perform a fine-tuning in the probability
distribution of $\tau_{\rm Ly\alpha}$, which in principle might
correlate with environment and multiple bulk and unresolved galaxy
properties other than the total amount of dust.

Figure 1. The galaxy bias (top) and stellar mass distribution (bottom)
of the mock LAE catalogs (color points/lines) compared with that derived
from observations in the literature (black points/histogram).
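
As an illustration of the bias measurement, the sketch below evaluates the Landy & Szalay (1993) estimator in a single separation bin and then takes the square root of the galaxy-to-matter correlation ratio, assuming the standard definition $b^2 = \xi_{\rm gal}/\xi_{\rm m}$. The pair counts and the matter correlation value are made-up numbers chosen only to exercise the formulas; they are not measurements from the mocks.

```python
import math

def landy_szalay(dd, dr, rr):
    """Landy & Szalay (1993) estimator for one separation bin.

    dd, dr, rr are the data-data, data-random, and random-random pair
    counts, each normalized by the total number of pairs of that type.
    """
    return (dd - 2.0 * dr + rr) / rr

def linear_bias(xi_gal, xi_matter):
    """Linear bias from the ratio of galaxy and matter correlation functions."""
    return math.sqrt(xi_gal / xi_matter)

# Hypothetical normalized pair counts in the bin around r = 8 Mpc/h.
xi_gal = landy_szalay(dd=1.30, dr=1.05, rr=1.00)
xi_matter = 0.05  # hypothetical matter correlation at the same separation
bias = linear_bias(xi_gal, xi_matter)
print(f"xi_gal = {xi_gal:.3f}, bias = {bias:.2f}")
```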

The stellar mass distributions of our mock LAE catalogs are
shown in the bottom panel of Figure 1 and summarized in Table 2.
The hatched histogram indicates that of the observed
non-AGN sample of HPS LAEs (z = 2.87;!;Q5g), showing a 
median log (M^/Mq) and 16/84 percentile scatter of 
(converted to Chabrier IMF adopted here; Hagen et al. 2014). 
The declines at both the low- and high-mass ends of the observed sample
are physical: the low-mass tail is caused by the declining SFR, and thus
the declining intrinsic Lyα production, of low-mass galaxies; the
high-mass tail originates from the increasing dust content of high-mass
star-forming galaxies. Incompleteness near the detection limit in
$L_{\rm Ly\alpha}$ does not propagate to bias the stellar mass
distribution because of the intrinsically poor correlation between
$L_{\rm Ly\alpha}$ and $M_*$. Our mock LAE catalogs show $M_*$
distributions similar to that of the observed LAEs, particularly in the
low median values of log($M_*$) and a similarly wide spread, which is
about twice as large as that of the (more massive) Lyman break galaxies
at the same epoch (e.g., Daddi et al. 2007). LAEs in general represent a
heterogeneous population of objects with various levels of gravitational
clustering, manifested in a large cosmic variance of the galaxy bias.

Using the empirical modeling of Lyα production and dust attenuation,
our mock LAEs successfully reproduce the observed galaxy bias and
stellar mass distribution simultaneously. For both of these properties,
Mock I appears to best match the observed star-forming LAEs. With a
small fraction of AGN hosts included, we consider Mock II ($b = 2.0$) as
the fiducial mock of the observed galaxy tracers. With this set of
mocks, we will discuss the fate of the large-scale structure at
z = 2.44 in HPS-COSMOS, and its uncertainty given the uncertainty in the
clustering properties of the galaxy tracers.

3. LARGE-SCALE STRUCTURE AT Z = 2.44 

Here we present the large-scale galaxy concentration found at z = 2.44
using the sample of LAEs in HPS-COSMOS supplemented by
continuum-selected galaxies with photo-z in COSMOS/UltraVISTA. The field
of view of HPS-COSMOS is of the same order as the predicted
characteristic angular size of proto-clusters (Chiang et al. 2013b).
However, the survey probes an order of magnitude longer depth along the
line of sight.


3.1. Redshift Distribution 

In the 71.6 arcmin$^2$ field of view of HPS-COSMOS (outlined in
Figure 3), the redshift distributions of LAEs, photo-z selected
galaxies, and the stellar mass volume density of the photo-z galaxies
smoothed to a large super-halo scale all show a significant peak at
z $\sim$ 2.44 (Figure 2).

The top panel of Figure 2 shows the line-of-sight distribution of 51
LAEs from z = 2.0 to 3.6 in HPS-COSMOS. The dashed line shows the
ensemble average redshift distribution derived from the whole sample of
LAEs in the 4 HPS fields, smoothed with a Gaussian kernel of
$\sigma = 0.15$ in redshift (normalized to indicate the expected number
of LAEs per redshift bin of 0.05 in the field of view of HPS-COSMOS). A
concentration of nine LAEs in the bin at z = 2.45 is clearly seen. Their
mean Lyα redshift is 2.441, indicated by a long thick tick. The ensemble
average number density of LAEs at this redshift is
$4.0 \times 10^{-4}$ Mpc$^{-3}$ $h^3$. Within the redshift-space bin
corresponding to a comoving volume of $8.5 \times 12.0 \times 43.5$
Mpc$^3$ $h^{-3}$, the ensemble average LAE number
$\langle N_{\rm LAE} \rangle$ is 1.8.
The LAE galaxy overdensity,

$$ \delta_{\rm LAE} = \frac{N_{\rm LAE} - \langle N_{\rm LAE} \rangle}{\langle N_{\rm LAE} \rangle}, \qquad (4) $$

is $\sim 4$, averaged over this redshift-space bin.
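
The arithmetic behind these numbers can be checked with a few lines. The sketch below uses only the values quoted in the text (mean density, bin dimensions, and nine observed LAEs); the function names are illustrative.

```python
def expected_count(mean_density, volume):
    """Ensemble average number of galaxies in a volume."""
    return mean_density * volume

def overdensity(n_observed, n_mean):
    """Fractional overdensity (N - <N>) / <N>."""
    return (n_observed - n_mean) / n_mean

bin_volume = 8.5 * 12.0 * 43.5             # comoving Mpc^3 h^-3
n_mean = expected_count(4.0e-4, bin_volume)  # density in Mpc^-3 h^3
delta_lae = overdensity(9, n_mean)
print(f"<N_LAE> = {n_mean:.2f}, delta_LAE = {delta_lae:.2f}")
```

The unrounded mean count is slightly below 1.8, so the computed overdensity comes out marginally above 4, consistent with the rounded values quoted above.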

The density peak is unlikely to arise from a Poisson sampling of a
spatially homogeneous density field, with a p-value of
$2 \times 10^{-4}$. Although it is well known that galaxies are
clustered, this suggests that the peak is a genuine large-scale
structure of physical origin instead of a statistical fluctuation. The
value of $\delta_{\rm LAE}$ together with the moderately low LAE bias of
$\sim 2$ suggests a matter overdensity of $\sim 2$, implying that even
at this large scale, the matter density field has already evolved into
the nonlinear regime. Based on both the linear theory of spherical
collapse (e.g., Peacock 1999) and the observational signatures of
cluster progenitors expected in $\Lambda$CDM cosmological simulations
(Chiang et al. 2013b), the overdensity of this structure at z = 2.44
appears more than sufficient for it to collapse and evolve into a
cluster ($> 10^{14}$ $M_\odot$) by z = 0. In § 4 we will study the fate
of the overdensity in more detail by comparing the observed LAE
distribution with the mock LAE catalogs described above.
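
A rough check of the two estimates in this paragraph is sketched below, under stated assumptions: the null hypothesis is a pure Poisson process with the rounded mean of 1.8 LAEs per bin, and the matter overdensity is the simple linear-bias estimate $\delta_{\rm LAE}/b$ with no correction for redshift-space distortions. The exact inputs used for the quoted p-value are not given in the text, so this calculation indicates the order of magnitude only.

```python
import math

def poisson_tail(n_observed, mean, n_terms=200):
    """P(N >= n_observed) for a Poisson distribution with the given mean."""
    term = math.exp(-mean) * mean ** n_observed / math.factorial(n_observed)
    total = 0.0
    k = n_observed
    for _ in range(n_terms):
        total += term
        k += 1
        term *= mean / k  # ratio of consecutive Poisson probabilities
    return total

p_value = poisson_tail(9, 1.8)
delta_matter = 4.0 / 2.0  # delta_LAE / b with the rounded values in the text
print(f"Poisson tail probability = {p_value:.1e}")
print(f"linear-bias matter overdensity = {delta_matter:.1f}")
```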

The middle panel of Figure 2 displays the photo-z distribution of
continuum-selected galaxies in the field of HPS-COSMOS (approximated
by an 8.46' $\times$ 8.46' square region).


The $\delta_{\rm LAE}$ is scale dependent and thus needs to be
interpreted carefully.

Figure 2. The redshift distributions of the LAE number count (top), the
continuum-selected galaxy number count with photometric redshifts
(middle), and the volume-combined stellar mass derived from SED fitting
of the continuum-selected galaxies (bottom) in the 71.6 arcmin$^2$
HPS-COSMOS field. The typical photometric redshift error of individual
continuum-selected galaxies is shown by the error bars. Each redshift
bin of width 0.05 corresponds to a comoving volume of
$\sim 8.5 \times 12.0 \times 43.5$ Mpc$^3$ $h^{-3}$ (at z = 2.5). Dashed
lines indicate the ensemble averages per redshift bin for each quantity.
The gray regions in the middle and bottom panels show the 68% scatter
per redshift bin for each quantity (only the scatter for $K_s < 23.4$ is
shown in the middle panel) estimated by randomly sampling the whole
$\sim 1.6$ deg$^2$ COSMOS field. The long and short thick ticks indicate
the redshifts of the HPS proto-cluster and a proto-cluster found in the
ZFOURGE survey (Spitler et al. 2012, see the Appendix), respectively.

The “bright” galaxy sample with $K_s < 22.0$ is shown by the yellow
histogram (right y-axis), and the whole $K_s < 23.4$ sample is
represented by the black hatched histogram. The typical photo-z error of
$\sigma_z = 0.03(1+z)$ at z = 2.5 is indicated in the figure legend. The
dashed line and gray shaded region are the median and 16/84 percentile
scatter of the number counts of $K_s < 23.4$ galaxies as a function of
redshift, calculated by randomly sampling the whole COSMOS field.
Although not shown here, the median redshift distribution for the bright
sample of $K_s < 22.0$ would differ slightly, and the scatter would be
larger than the gray region for $K_s < 23.4$ galaxies due to both a
larger shot noise and a higher cosmic variance (higher intrinsic
clustering). Both the $K_s < 22.0$ and $K_s < 23.4$ galaxy number counts
clearly reveal a density peak at $2.4 < z_{\rm phot} < 2.5$ coinciding
with the highest LAE concentration in HPS. The overdensity appears to be
more pronounced for bright/massive galaxies, as has been seen previously
in other massive proto-clusters (Steidel et al. 2005). Chiang et al.
(2014) compared the z $\sim$ 2.45 density peak traced by the identical
sample of $K_s < 23.4$ galaxies with a large set of matched SAM
lightcones (post-processed with observational selection effects and
redshift errors), and found that even under this level of redshift
uncertainty, the overdensity in photo-z galaxies suggests, with a
$\sim 70\%$ confidence level, that this structure will evolve into a
cluster with $M_{z=0} > 10^{14}$ $M_\odot$ by z = 0. We will see in § 4
that the LAE distribution with precise redshifts provides consistent but
much stronger constraints on the fate of the structure.

The bottom panel of Figure 2 shows, with the blue histogram, the photo-z
distribution of stellar mass combining all the continuum-selected
galaxies ($K_s < 23.4$) within each redshift-space bin. The dashed line
and gray region show the median and 16/84 percentile scatter of this
distribution estimated by randomly sampling the whole COSMOS field.
Similar to the previous case of galaxy number count, photo-z errors
largely smooth out the fluctuation, and slightly reduce the (apparent)
cosmic variance, which dominates the gray region. A peak at
$2.4 < z_{\rm phot} < 2.5$ is, again, clearly present. A stellar mass
overdensity can be defined as

$$ \delta_* = \frac{\rho_* - \langle \rho_* \rangle}{\langle \rho_* \rangle}, \qquad (5) $$

where $\rho_*$ and $\langle \rho_* \rangle$ are the stellar mass density
calculated in a given window and the cosmic stellar mass density at the
same epoch, respectively. The $\delta_*$ of the most significant bin at
$2.45 < z_{\rm phot} < 2.5$ is $\sim 3$, with a signal-to-noise ratio of
> 4, which is much higher than that of the number counts of the same
galaxy sample shown previously. This difference is related to the fact
that there is a higher fractional excess of bright galaxies in this
structure, as shown previously. The inclusion of faint galaxies also
plays a role in reducing the noise. The scatter shown with the gray
region includes not only the cosmic variance but also the shot noise of
galaxy counts and the systematics in SED fitting.

3.2. Projected Spatial Distribution 

The z = 2.44 structure can be seen in the distribution of photo-z
galaxies projected on the sky. The top panel of Figure 3 presents the
overdensity map of continuum-selected galaxies in the central
$1.2 \times 1.0$ deg$^2$ of COSMOS in a thin redshift slice centered at
$z_{\rm phot} = 2.45$. Dots represent galaxies with a $z_{\rm phot}$
within a full width of $\sigma_z$. This map was generated (but not
shown) in the work of Chiang et al. (2014) to search for cluster
progenitors. We have smoothed the galaxy distribution with a scale of
$\sim 15$ Mpc comoving that corresponds to the typical angular size of
proto-clusters. The galaxy overdensity, $\delta_{\rm gal}$ (as defined
in Equation (4)), is calculated in a cylindrical window with a radius
r = 5' and a redshift depth (full width) of
$l_z = \sigma_z = 0.025(1+z)$. Regions of local $\delta_{\rm gal}$
maxima were then identified, and compared with those in a set of matched
SAM galaxy lightcones. The three overdense regions shown in red are
strong candidate proto-clusters of $M_{z=0} > 10^{14}$ $M_\odot$, with a
confidence level of $\sim 70\%$. The feature close to the field center
corresponds to the HPS-COSMOS z = 2.44 structure discussed in this work,
where the HPS field is outlined in black. This overdensity roughly fills
the whole field of HPS-COSMOS and extends a few arcmin to the west. The
size of this structure is on the order of 20 Mpc comoving, consistent
with that of massive cluster progenitors studied in simulations (Suwa et
al. 2006; Chiang et al. 2013b; Stark et al. 2014).

In the bottom panel of Figure 3 we expand the scale to show the
HPS-COSMOS field. The background $\delta_{\rm gal}$ map is the same as
that shown in the top panel. The nine LAEs in the redshift spike
presented in § 3.1 are indicated by red stars. Dots represent
continuum-selected galaxies with a photometric redshift within
$2.45 \pm \sigma_z$, and those within $2.45 \pm 0.5\sigma_z$ are marked
by larger symbols.

The other two, at least equally prominent photo-z overdensities in this
map have their peaks in redshift slices near but not in this slice,
which correspond to candidate proto-clusters PC17 (z = 2.42) and PC20
(z = 2.48), respectively, in Chiang et al. (2014). They are potentially
more massive structures, but the uncertainties in mass overdensity are
much larger than that of the HPS z = 2.44 structure with LAE redshifts
presented in this paper. The confirmation of these two structures
requires spectroscopic follow-up.

Figure 3. Sky map of the galaxy distribution at z $\sim$ 2.44 for the
$1.2 \times 1.0$ deg$^2$ COSMOS field (top) and a zoom-in of the
HPS-COSMOS field indicated by the black outline (bottom). The background
color map in both panels shows the density of continuum-selected
galaxies with photo-z ($K_s < 23.4$) smoothed with a cylindrical window
of r = 5' and a depth $l_z$ of $\sigma_z = 0.025(1+z)$ as presented in
Chiang et al. (2014). Dots in the top panel represent the galaxy sample
used to calculate the large-scale density map within a photo-z full
width of $\sigma_z$, and additional ones within a photo-z full width of
$2\sigma_z$ are marked in the bottom panel with smaller symbols. In the
bottom panel, stars indicate HPS LAEs. The diamonds indicate
continuum-selected LBGs with spectroscopic redshifts confirmed in the
zCOSMOS survey and the observations in Diener et al. (2015). The dotted
outline represents the Lyα forest tomography field observed by Lee et
al. (2014a).

Figure 4. The line-of-sight velocity $v_{\rm los}$ distribution of
HPS-COSMOS LAEs centered at z = 2.441. Red and hatched elements indicate
LAEs with an extended Lyα halo and an X-ray counterpart, respectively.

3.3. Stellar Mass of the Continuum-selected Galaxies 

Continuum-selected galaxies ($K_s < 23.4$) inside the HPS-COSMOS
field with $2.35 < z_{\rm phot} < 2.5$ have a median stellar
mass of 4.5;!;3'5 x 10^° Mq (among a sample of 33), which is 
about double of that of 2 . 1);['3 x 10'° Mq for galaxies outside 
the overdensity with the same $K_s$-band limit and redshift.

3.4. Substructures 

Figure 4 shows the detailed line-of-sight velocity $v_{\rm los}$
distribution of HPS-COSMOS LAEs centered at z = 2.441 (the mean redshift
of the nine LAEs in the overdensity). The nine LAEs span a full range of
$\sim 2500$ km s$^{-1}$ in $v_{\rm los}$, with a dispersion
$\sigma_{v,{\rm los}}$ of 905 km s$^{-1}$ (using the gapper estimator
for small N in Beers et al. 1990). Given the large spatial extent of the
structure on the projected sky, this high $\sigma_{v,{\rm los}}$ is
unlikely to be dominated by peculiar velocities of a collapsed
structure. There appear to be two substructures, labeled A and B in
Figure 4 (hereafter groups A and B, though the term “group” here does
not refer to galaxies in a common parent halo). These substructures show
$\sigma_{v,{\rm los}}$ of 456 and 221 km s$^{-1}$ for groups A and B,
respectively, with a separation of $\sim 1600$ km s$^{-1}$ in their mean
velocities. This separation corresponds to a line-of-sight comoving
distance of 22.4 Mpc, which is larger than the HPS-COSMOS field size of
14.5 Mpc on the sky. Indeed, groups A and B both have their members
scattered across the entire HPS-COSMOS field on the projected sky.
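
The two calculations used in this subsection can be sketched as follows: the gapper scale estimator of Beers et al. (1990), and the conversion of a velocity offset into a line-of-sight comoving separation, $\Delta v\,(1+z)/H(z)$. The cosmological parameters below ($H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_m = 0.3$, flat) are assumptions made for illustration; the cosmology adopted in the paper is not stated in this section, so the result is only expected to be close to the quoted 22.4 Mpc. The three-point velocity list is a toy input.

```python
import math

def gapper(values):
    """Gapper scale estimator: sqrt(pi)/(n(n-1)) * sum_i i(n-i) * gap_i."""
    x = sorted(values)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two values")
    weighted_gaps = sum(i * (n - i) * (x[i] - x[i - 1]) for i in range(1, n))
    return math.sqrt(math.pi) / (n * (n - 1)) * weighted_gaps

def los_comoving_distance(delta_v, z, h0=70.0, omega_m=0.3):
    """Comoving separation [Mpc] for a velocity offset [km/s] at redshift z.

    Assumes a flat LambdaCDM expansion rate; h0 and omega_m are
    illustrative values, not taken from the paper.
    """
    hz = h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))
    return delta_v * (1.0 + z) / hz

sigma_toy = gapper([0.0, 1.0, 2.0])              # toy velocities
separation = los_comoving_distance(1600.0, 2.441)
print(f"toy gapper dispersion = {sigma_toy:.3f}")
print(f"1600 km/s at z = 2.441 -> {separation:.1f} Mpc comoving")
```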

3.5. Other Evidence of the Structure in the Literature 

Using spectroscopic redshifts of continuum-selected galaxies in the
zCOSMOS-deep survey, Diener et al. (2013) identified 42 “proto-groups”
in COSMOS at 1.8 < z < 3.0. These systems were identified using a
working definition of associations of > 3 galaxies that pass a linking
length criterion, and are expected to each assemble into a single halo
by z = 0. Strikingly, the richest structure (five galaxies, ID 22 in
Diener et al. 2013) found in this large volume is located immediately
west of the HPS structure at the same redshift of 2.44. It also
coincides with the spatial extent of the photo-z galaxy overdensity
shown previously in § 3.2 and Figure 3. Diener et al. (2015)
spectroscopically confirmed a total of 11 galaxies (diamonds in the
bottom panel of Figure 3) and gave a central redshift of 2.45. They
suggest that this structure will collapse to form a massive cluster of
$10^{14}$-$10^{15}$ $M_\odot$ by z = 0. With their spectroscopic
campaign in a wider field, this result strongly suggests that the HPS
z = 2.44 structure is indeed large and associated with an extremely rare
density concentration.

Lee et al. (2014a) presented a 3-dimensional cosmic density
reconstruction in a 5' $\times$ 11.8' field in COSMOS at
2.20 < z < 2.45 using tomography of Lyα absorption seen in the spectra
of bright background galaxies. This field (dotted line in the bottom
panel of Figure 3) coincides with the east half of the HPS-COSMOS field.
As shown in Figure 3 of Lee et al. (2014a), there is a strong and
complex overdensity of Lyα absorbing gas (the densest in the survey
volume) at 2.43 < z < 2.45, coinciding with our HPS LAE overdensity at
z = 2.44. Their figure shows another three spectroscopically confirmed,
broad-band selected LBGs (from Lilly et al. 2007; Le Fevre et al. 2015)
in this structure. This result independently supports the large-scale
structure seen in HPS-COSMOS at z = 2.44.

The algorithm in Diener et al. (2013) is designed to identify groups or
group progenitors, thus capturing overdensities with a scale smaller
than that considered in this work for cluster progenitors. Their galaxy
selection based on broad-band colors and limiting magnitudes
($K_s < 23.5$, $B < 25.3$) typically excludes LAEs, which by definition
have a large excess of Lyα with respect to the stellar continuum.

4. COSMIC EVOLUTION OF THE STRUCTURE

We now examine the fate of the HPS-COSMOS large-scale structure at
z = 2.44 using the mock LAE catalogs constructed in § 2.3. A large
number of realizations of simulated HPS-COSMOS observations are
generated. First, for each simulation box (500 Mpc $h^{-1}$ comoving) of
the four mock LAE catalogs, we generate three projected pseudo-lightcones
from the z = 2.4 snapshot with viewing angles along the x, y, and z
axes, respectively. Specifically, the apparent redshift of each LAE is
determined by its line-of-sight position (for the component of the
Hubble expansion) and peculiar velocity. Galaxy properties are
non-evolving to focus on the comparison at z $\sim$ 2.44. Second, we
target each pseudo-lightcone with a large number of fields of
8.46' $\times$ 8.46' that match the area of HPS-COSMOS, each probing a
pencil-beam-like volume. Third, regions similar to the observed z = 2.44
overdensity are identified as mock structures. The main constraints
provided by the observations are the level of LAE overdensity and their
distribution along the line of sight, including the substructures
described above. We define a set of criteria to identify mock structures
in simulations: (1) there must be nine LAEs within a full span of 26.7
to 47.2 Mpc comoving along the line of sight, which corresponds to the
2$\sigma$ limits of that observed for the HPS-COSMOS overdensity, and
(2) to match the more compact substructure of group A, six out of the
nine LAEs are required to be in a line-of-sight interval within 23.6 Mpc
comoving, the 1$\sigma$ upper limit of that observed. These criteria

Here the number count of LAEs is considered definite, as the effects of
the shot noise on the proto-cluster characterization will be captured
automatically by selecting a large number of realizations of mock
structures. The uncertainty in the full span of the structure is
estimated by bootstrapping the structure-centered distances of the nine
observed LAEs, with an additional contribution from instrument error
($\sigma_{v,{\rm los}} = 130$ km s$^{-1}$).

Table 3

Properties of Mock Structures of LAEs at z = 2.4


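Columns (units in brackets; letters refer to the table notes): ID;
log $M_{z=0}$ [$M_\odot$] (a); $N_{\rm merged}$ (b); angular separation
[arcmin] (c); $D_{\rm los,app}$ [Mpc] (d); $D_{\rm los,int}$ [Mpc] (e);
$\sigma_{v,{\rm los,app}}$ [km s$^{-1}$] (f);
$\sigma_{v,{\rm los,int}}$ [km s$^{-1}$] (g);
$\sigma^{\rm A}_{v,{\rm los,app}}$ [km s$^{-1}$] (h);
$\sigma^{\rm A}_{v,{\rm los,int}}$ [km s$^{-1}$] (i).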

Mock I 

14.58ti^ 

3.24±L99 

5.69!5;21 

31-6211 

37.421:1 

76412* 

730+76 

^•7^-94 

30+1* 

771 +104 
^^i-92 

Mock II 

14 4Q+0 45 

3.23 ±1.92 

6.38!«4 

32.431“ 

37.69112 

7571“ 

22913 

2851“ 

202II 

Mock III 


3.49 ±1.96 

5.66!^:“ 

32.63l;g 

37 74+8-21 
-6.29 

■770 + 143 
' 

701 +109 
^■^^-87 

28+12 

204128 

Mock IV 

ih-.jz_035 

3.71 ± 1.99 

5.67!^:‘3 

32.061:1 

37 40+6-69 
- 6.12 

748!1‘ 

2301^ 

278!|» 

210+114 


(a) Median virial mass of the most massive z = 0 descendant DM halo (friends-of-friends group central) of a mock structure.

(b) Number of LAEs in each mock structure that will be merged into the same friends-of-friends group by z = 0.

(c) Angular separation between the field center targeting a mock structure and the true center of the corresponding proto-cluster (defined to be the center of mass of its member DM halos).

(d) Full size (comoving) of a mock structure of nine LAEs along the line of sight in redshift space.

(e) Full size (comoving) of a mock structure of nine LAEs along the line of sight in real space.

(f) Line-of-sight velocity dispersion of a mock structure in redshift space.

(g) Line-of-sight velocity dispersion of a mock structure in real space (peculiar velocity only).

(h) Line-of-sight velocity dispersion of the main substructure (criterion 2 in the text) in redshift space.

(i) Line-of-sight velocity dispersion of the main substructure (criterion 2 in the text) in real space (peculiar velocity only).


select a few thousand mock structures per mock LAE catalog. A fraction
of the structures represent the same underlying structures seen under
different viewing angles and/or covered by different realizations of the
HPS-COSMOS pointing on the sky (i.e., different field centers). Finally,
we examine the relation between these high-redshift mock LAE structures
and their z = 0 descendant halos.
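
One way to apply the two selection criteria to the LAEs falling in a single mock field is sketched below. It reflects one reading of the criteria (nine LAEs that are consecutive along the line of sight, with the span measured between the first and last of them); the function name and the toy positions are hypothetical, and the authors' actual implementation may differ.

```python
def find_mock_structures(los_positions, n_members=9, span_range=(26.7, 47.2),
                         n_core=6, core_max=23.6):
    """Return groups of consecutive LAEs that satisfy both criteria.

    los_positions: redshift-space line-of-sight positions [comoving Mpc]
    of the LAEs in one field.
    """
    x = sorted(los_positions)
    structures = []
    for i in range(len(x) - n_members + 1):
        group = x[i:i + n_members]
        span = group[-1] - group[0]
        # Criterion (1): full span of the nine LAEs within the allowed range.
        if not (span_range[0] <= span <= span_range[1]):
            continue
        # Criterion (2): some six of the nine lie within core_max of each other.
        has_core = any(group[j + n_core - 1] - group[j] <= core_max
                       for j in range(n_members - n_core + 1))
        if has_core:
            structures.append(group)
    return structures

# Toy field: a compact clump of six plus three more members, and two
# unrelated LAEs far along the line of sight.
clumpy = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0, 25.0, 30.0, 35.0, 200.0, 400.0]
uniform = [5.0 * k for k in range(9)]  # span 40 Mpc but no compact core
print(len(find_mock_structures(clumpy)), len(find_mock_structures(uniform)))
```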

Table 3 summarizes the main properties of mock HPS structures at z = 2.4
and their descendants at z = 0. This structure has a large line-of-sight
extent in redshift space of $D_{\rm los,app} \sim 32$ Mpc comoving.
Excluding the contribution from peculiar velocities, the simulations
show that its real-space full size $D_{\rm los,int}$ is $\sim 38$ Mpc,
larger than its $D_{\rm los,app}$. This is a classic signature of the
Kaiser effect (Kaiser 1987), suggesting that the outermost shell (as in
the picture of the spherical collapse scenario) has already decoupled
from the cosmic expansion and started to collapse in comoving space.
Most of the LAEs in the structure occupy distinct DM halos during the
observed epoch; these halos have already been influenced by self-gravity
as an ensemble, and will combine to form larger halos in later epochs.
The most massive z = 0 descendant halo of this structure is expected to
have a cluster-scale virial mass (Table 3; $\sim 90\%$ probability of
$M_{z=0} > 10^{14}$ $M_\odot$), corresponding to a massive galaxy
cluster. Only $\sim 3$ LAEs in the structure will be merged onto this
main halo by z = 0, and are considered to be the true members of the
proto-cluster. It is less certain whether the secondary substructure,
group B, can evolve into a cluster-scale halo at z = 0. Since the whole
structure at z = 2.44 has already broken away from the Hubble flow, we
expect a gravitationally bound, but not entirely virialized, descendant
structure at z = 0 (with a size of several physical Mpc) containing a
massive cluster.

Although the structure is most likely to be a genuine proto-cluster with
$M_{z=0} > 10^{14}$ $M_\odot$, there is a $\sim 10\%$ chance that the
most massive z = 0 descendant halo will have a smaller virial mass,
below $10^{14}$ $M_\odot$. In this case the structure would be
considered a massive proto-group (e.g., Diener et al. 2013). Such a
slightly lower mass overdensity is often associated with cosmic web
filaments, which have been studied in more detail at lower redshifts
(Sobral et al. 2013; Darvish et al. 2014; Hayashi et al. 2014; Sobral et
al. 2015).


The inferences of the mass overdensity and z = 0 virial mass would stay
the same if we excluded X-ray detected LAEs and traced the structure
using star-forming LAEs only. In this case the z = 2.44 overdensity
consists of eight LAEs instead of nine, while a lower-bias mock galaxy
population (Mock I with $b = 1.82$) would be considered as fiducial to
interpret the observation, resulting in a nearly identical level of
inferred mass overdensity. We caution that our results would be biased
if the Lyα escape fraction were to depend strongly on large-scale
environment. However, a strong environmental effect would result in a
galaxy two-point correlation function that significantly departs from
the power-law form measured for typical star-forming galaxies and DM
halos in simulations on relevant scales. Such a departure is not seen
for observed LAEs (Gawiser et al. 2007; Kovac et al. 2007; Ouchi et al.
2008, 2010; Guaita et al. 2010; Bielby et al. 2015).

5. EXTENDED Lyα HALOS AND AGNS

In the top panel of Figure 2 and in Figure 4, we label the extended Lyα
sources in red. As described in § 2.1, these systems are robustly ruled
out as point sources, with diameters of several tens of physical kpc
(see the Lyα surface brightness profiles of the most extended sources in
Adams et al. 2011). Strikingly, an enhancement of extended LAEs in
large-scale overdensities is present. Five out of six extended LAEs in
HPS-COSMOS are in large-scale overdense regions: four in our HPS-COSMOS
structure at z = 2.44 and another one in a z = 2.10 structure discovered
in the ZFOURGE survey (Spitler et al. 2012; Yuan et al. 2014) with three
LAEs detected in HPS-COSMOS (see the Appendix). The tendency for
extended LAEs to be in overdense regions is highly significant against
random fluctuations with a p-value
of $3 \times 10^{-3}$. Within the z = 2.44 structure, four out of the
total nine LAEs are extended. These four extended LAEs are distributed
in both group A and group B (Figure 4), making this environment-Lyα blob
correlation prominent at a scale at least equal to or larger than that
of a cluster progenitor, as we have shown that the z = 0 descendant of
the whole z = 2.44 structure will still be collapsing around a massive
virialized cluster. Also, HPS-261 (see Table 4 in the Appendix), the
extended LAE associated with the ZFOURGE z = 2.10 structure, is
$\sim 2.5$ Mpc (physical) away from the well-confirmed density peaks of
the proto-cluster (Yuan et al. 2014). Therefore, the properties of
circumgalactic-scale Lyα emission appear to be directly or indirectly
connected to the elevated DM and baryon density on a super-halo scale.

Four LAEs are detected in X-ray (hatched regions in the 
top panel of Figure 2 and Figure 4). Their X-ray luminos¬ 
ity {Lx ~ 10^“*- 10“*^ ergs“'), being ^ 3-20 times larger than 
their observed Lya luminosity, implies that AGN photoion¬ 
ization likely dominates the intrinsic Lya production in these 
systems. Two out of the four AGNs are in known large-scale 
overdensities: one in the z = 2.10 structure (see the Appendix) and one
in the z = 2.44 structure. This result gives a moderately low p-value of
0.2 against the null hypothesis that AGNs are a random subset of the
bulk LAE population drawn from a uniform probability distribution.

Two out of the total six extended Lyα sources in HPS-COSMOS are
associated with X-ray detected AGNs. The scenario that extended Lyα
halos tend to host AGNs is significant against random fluctuation, with
a p-value of 0.06. Furthermore, these AGN-powered Lyα halos are all in
overdense regions, implying possible causal links behind the correlation
of environment, AGNs, and the size of Lyα emission.
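
The text does not state which statistical test underlies the p-values in this section. As a plausibility check only, the sketch below evaluates hypergeometric tail probabilities (random draws without replacement) under two assumptions that are not taken from the text: that the parent sample is the 51 HPS-COSMOS LAEs shown in the top panel of Figure 2, and that 12 of them (nine at z = 2.44 plus three at z = 2.10) lie in overdensities. The results land near the quoted values but are not expected to reproduce them exactly.

```python
from math import comb

def hypergeom_tail(k_min, n_draw, n_success, n_total):
    """P(X >= k_min) when n_draw objects are drawn without replacement
    from n_total objects, n_success of which have the property of interest."""
    upper = min(n_draw, n_success)
    favorable = sum(comb(n_success, k) * comb(n_total - n_success, n_draw - k)
                    for k in range(k_min, upper + 1))
    return favorable / comb(n_total, n_draw)

N_LAE = 51        # assumed parent sample size
N_OVERDENSE = 12  # assumed number of LAEs in the two overdensities

# Five of the six extended LAEs lie in overdensities.
p_extended = hypergeom_tail(5, 6, N_OVERDENSE, N_LAE)
# Two of the four X-ray AGNs lie in overdensities.
p_agn = hypergeom_tail(2, 4, N_OVERDENSE, N_LAE)
# Two of the six extended LAEs are among the four X-ray AGNs.
p_extended_agn = hypergeom_tail(2, 6, 4, N_LAE)
print(f"{p_extended:.1e}  {p_agn:.2f}  {p_extended_agn:.2f}")
```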

6. PROTO-CLUSTERS IN THE HETDEX SURVEY

Using the HET and VIRUS, the coming HETDEX survey will perform blind
spectroscopy of order a million LAEs at 1.9 < z < 3.5, allowing the
construction of cosmic density maps for galaxy environmental studies and
the selection of a large and homogeneous sample of cluster progenitors.
To precisely measure the matter power spectrum at the peak scale of BAO
for dark energy science, the main survey (300 deg$^2$ Spring field plus
150 deg$^2$ Fall field, hereafter HETDEX-DEX) will sample the large area
sparsely (Chiang et al. 2013a), with a 1/4.5 spatial filling factor (the
fraction of sky area covered by IFU fibers). This would impact,
unfortunately, the performance of localized studies in real space
through increasing shot noise. However, in a 28 deg$^2$ area within the
Fall field overlapping with the Spitzer-HETDEX Exploratory Large Area
(SHELA; PI: Papovich) survey and other ancillary photometry (hereafter
HETDEX-SHELA), complete coverage (unity filling factor) will be achieved
by multiple dithering. Here we examine the performance of proto-cluster
identification expected in HETDEX-DEX and HETDEX-SHELA with a
counts-in-cells algorithm applied to our mock LAE catalogs. This
analysis essentially uses the correlation between the high-redshift
local LAE overdensity $\delta_{\rm LAE}$ and the z = 0 descendant halo
mass $M_{z=0}$ under the inclusion of observational effects and
realistic noise. Implicitly, the input cosmology, gravitational
structure formation, and galaxy formation model in the simulation
together are used as the prior of the analysis. The difference in the
survey filling factor of our two baseline fields here allows us to
demonstrate the effects of a generic noise source in density mapping:
the shot noise that arises from a discrete and finite sampling of the
underlying parent distribution.

Under the wavelength-dependent line sensitivity of HETDEX and assuming
the Lyα luminosity function of Gronwall et al. (2007) for LAEs with no
redshift evolution between 1.9 < z < 3.5, the expected comoving number
density of HETDEX LAEs is nearly flat at $\sim 8 \times 10^{-4}$
Mpc$^{-3}$ at 1.9 < z < 2.5, twice that in HPS, and decreases to
$\sim 3 \times 10^{-4}$ Mpc$^{-3}$ at z = 3.5. For HETDEX-DEX (1/4.5
filled) and HETDEX-SHELA (completely filled), we generate a mock LAE
catalog at z = 2.4, based on the LAE modeling described in § 2.3. These
two catalogs have the same clustering properties as the Mock II used for
characterizing the HPS z = 2.44 structure. They have different ensemble
average LAE number densities,
$n_{\rm HETDEX-SHELA} : n_{\rm HPS} : n_{\rm HETDEX-DEX} = 1 : 1/2 : 1/4.5$.
We implement this feature by tuning for each mock the survival
probability described in § 2.3 to account for the uncertain
stochasticity of Lyα escape, plus, for the case of HETDEX-DEX, the
incompleteness due to a sub-unity survey filling factor.

We then perform a counts-in-cells analysis of the LAE overdensity in the
mocks and examine its dependence on the z = 0 descendant halo mass. A
redshift-space cylindrical window of r = 6 Mpc $h^{-1}$ comoving and
$l_{\rm los} = 20$ Mpc $h^{-1}$ comoving (including peculiar velocity)
is used to calculate the local LAE number $N_{\rm LAE}$ and overdensity
$\delta_{\rm LAE}$. This window is ideal for
the observed density contrast at z > 2 between proto-clusters 
of ~ 10'“* ^ Mq and field, while a more sophisticated 
optimization can be performed by varying the window with redshift, the
target $M_{z=0}$, and the filling factor. Thus the performance of
proto-cluster identification presented below should be viewed as a
lower limit.
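
A minimal counts-in-cells sketch for a single cylindrical window follows. It assumes that 20 Mpc $h^{-1}$ is the full length of the cylinder along the line of sight and ignores periodic box boundaries; the function names and the four toy positions are hypothetical, and the mean density is the HETDEX value quoted above, taken here to be in units of Mpc$^{-3}$ $h^3$.

```python
import math

def count_in_cylinder(galaxies, center, radius=6.0, length=20.0):
    """Count galaxies inside a cylinder whose axis is the line of sight (z).

    galaxies: iterable of (x, y, z) redshift-space positions [Mpc/h];
    radius and length [Mpc/h], with length taken as the full depth.
    """
    cx, cy, cz = center
    half_length = 0.5 * length
    return sum(
        1 for x, y, z in galaxies
        if abs(z - cz) <= half_length
        and (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
    )

def cell_overdensity(n_in_cell, mean_density, radius=6.0, length=20.0):
    """Overdensity N / <N> - 1 for one cylindrical cell."""
    n_mean = mean_density * math.pi * radius ** 2 * length
    return n_in_cell / n_mean - 1.0

toy_galaxies = [(0.0, 0.0, 0.0), (5.0, 0.0, 9.0),
                (7.0, 0.0, 0.0), (0.0, 0.0, 11.0)]
n_cell = count_in_cylinder(toy_galaxies, center=(0.0, 0.0, 0.0))
delta_cell = cell_overdensity(n_cell, mean_density=8.0e-4)
print(n_cell, round(delta_cell, 3))
```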

The left panels of Figure 5 show, at z = 2.4, the expected
probability distribution of $\delta_{\rm LAE}$ globally (gray histograms)
and that of the regions centered on proto-clusters with > 
10*4 ^ Mq (color histograms) in the simulation. The upper and 
bottom panels show the expected results for the HETDEX- 
SHELA and HETDEX-DEX surveys, respectively. These 
5lae distributions represent a same intrinsic correlation be¬ 
tween large-scale mass budget and their z = 0 collapsed mass 
modulated by different levels of shot noise, which fraction¬ 
ally scales with roughly the inverse square root of the true 
population mean number per window (characterized approx¬ 
imately by a Poisson process). In the case of HETDEX- 
SHELA, proto-clusters show a significantly higher Jlae com¬ 
pared to the ensemble, where a threshold in (5lae c™ be used 
to separate proto-cluster regions from field. In the case of the 
HETDEX-DEX, only proto-clusters with the highest (5lae can 
be separated, thus producing a much lower completeness. 
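
The Poisson scaling quoted above can be made concrete with a short Python sketch. The mean count per window at the HETDEX-SHELA density (`N_SHELA`) is a hypothetical placeholder, not a value from the paper; only the ratios between surveys follow from the number densities given in the text.

```python
import math


def fractional_shot_noise(mean_count):
    """Poisson fractional noise, sigma_N / N = 1 / sqrt(N)."""
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return 1.0 / math.sqrt(mean_count)


# Hypothetical mean number of LAEs per window at the HETDEX-SHELA density.
N_SHELA = 2.0

# Relative number densities from the text.
relative_density = {"HETDEX-SHELA": 1.0, "HPS": 0.5, "HETDEX-DEX": 1.0 / 4.5}

noise = {k: fractional_shot_noise(N_SHELA * f) for k, f in relative_density.items()}

# Noise relative to HETDEX-SHELA; independent of the assumed N_SHELA.
noise_ratio = {k: v / noise["HETDEX-SHELA"] for k, v in noise.items()}
```

Under this approximation the fractional noise in HETDEX-DEX is larger than in HETDEX-SHELA by a factor of sqrt(4.5), regardless of the assumed mean count.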

Since non-proto-cluster regions occupy the bulk of the cosmic
volume and can appear dense due to sampling noise and the
intrinsic scatter (usually subdominant), it needs to be quantified
how well $M_{z=0}$ can be recovered given a measured
$\delta_{\rm LAE}$. We show this correlation for each HETDEX baseline
field in the right panels of Figure 5. $M_{z=0}$ is the virial mass
of the most massive z = 0 descendant halo (friends-of-friends
group central) of the LAEs within a window for measuring
$\delta_{\rm LAE}$. The dots and error bars indicate, respectively, the median
and 16/84 percentile scatter of $M_{z=0}$ at a given $\delta_{\rm LAE}$.
For HETDEX-SHELA, the correlation is fairly
tight. The scatter in $M_{z=0}$ shrinks from ~ 1.5 dex at $\delta_{\rm LAE} \approx 0$
to < 0.5 dex at $\delta_{\rm LAE}$ ~ 10, showing that the most massive
proto-clusters, while rare, can be identified robustly in
HETDEX-SHELA. On the other hand, the larger shot noise
(horizontal scatter in nature) in HETDEX-DEX not only extends
the range of possible $\delta_{\rm LAE}$ (also shown in the left panels)
but also increases the scatter of this correlation. The
median $M_{z=0}$-$\delta_{\rm LAE}$ correlation in HETDEX-DEX lies everywhere
below that in HETDEX-SHELA. This result is due to
the upward scatter from intrinsically less dense regions, which
outweighs the downward scatter because of the much higher
abundance of the former; thus the estimated $M_{z=0}$ for a genuine
dense structure is biased low when the noise is finite.
If the structure shows other evidence of overdensity, as in
the case of the HPS structure at z = 2.44, a deeper Lyα observational
program is likely to increase the best-estimate
$M_{z=0}$, which asymptotically approaches the true value for
a large N.

** The slightly deeper Lyα luminosity limit of HETDEX compared with
that of HPS is expected to have only a limited effect on the bias of LAEs
(Orsi et al. 2008).

Figure 5. Left panels: the probability distribution of LAE overdensity $\delta_{\rm LAE}$ globally (gray histograms) and in regions centered on proto-clusters with $M_{z=0} > 10^{14.5}\,M_\odot$ (color histograms) in simulations for the 28 deg$^2$ HETDEX-SHELA (top) and the ~ 450 deg$^2$ dark energy survey of HETDEX (bottom), where 1/4.5 of the area will be covered by IFU fibers. $\delta_{\rm LAE}$ is measured in a cylindrical window of $r = 6$ Mpc $h^{-1}$ comoving and $l_{\rm los} = 20$ Mpc $h^{-1}$ comoving. Right panels: median and 16/84 percentile scatter of the z = 0 descendant halo mass $M_{z=0}$ as a function of $\delta_{\rm LAE}$ for each HETDEX baseline field, evaluated by sampling the whole volume of the simulations randomly.


For $M_{z=0} > 10^{15}\,M_\odot$ proto-clusters and a required purity
of 70%, 80%, and 90%, the completeness in HETDEX-SHELA
is ≈ 50%, 30%, and 15%, respectively; in the case of HETDEX-DEX,
the completeness decreases to 5%, 1%, and nearly 0%,
respectively, as the lower scatter in the bottom-right panel of
Figure 5 never reaches much above $10^{15}\,M_\odot$. These estimates
represent the minimum performance. An ideal strategy for the
wide HETDEX-DEX would be to focus on finding
the largest and rarest proto-clusters, where an even larger
window can be beneficial since these structures remain overdense
on a large scale. A large window is also preferred for a
statistical reason: the fractional shot noise, which roughly scales
with the volume of the window to the -1/2 power (the inverse
square root of the mean number per window), can be reduced.
Unfortunately, in this case the accuracy of the positional centering
and the handling of substructure remain poor. Additional
investigations of these densest structures in HETDEX-DEX
are needed to calculate their exact overdensity, and would
supplement the sample found in HETDEX-SHELA with a set of
massive systems.
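
The purity and completeness figures above come from applying an overdensity threshold to the mock windows. A minimal Python sketch of that bookkeeping follows; the toy arrays are invented for illustration and the function name is hypothetical. In practice the inputs would be the measured overdensity of every window and a flag marking whether its descendant mass exceeds the target.

```python
import numpy as np


def purity_completeness(delta, is_protocluster, threshold):
    """Purity and completeness of the selection delta >= threshold.

    purity       = true proto-clusters selected / all windows selected
    completeness = true proto-clusters selected / all true proto-clusters
    """
    delta = np.asarray(delta, dtype=float)
    is_protocluster = np.asarray(is_protocluster, dtype=bool)
    selected = delta >= threshold
    n_selected = np.count_nonzero(selected)
    n_true = np.count_nonzero(is_protocluster)
    n_hit = np.count_nonzero(selected & is_protocluster)
    purity = n_hit / n_selected if n_selected else float("nan")
    completeness = n_hit / n_true if n_true else float("nan")
    return purity, completeness


# Toy windows (invented values): overdensity and proto-cluster flag.
delta_toy = [0.5, 1.0, 3.0, 4.0, 6.0]
flag_toy = [False, True, False, True, True]

strict = purity_completeness(delta_toy, flag_toy, threshold=3.5)
loose = purity_completeness(delta_toy, flag_toy, threshold=2.0)
```

Raising the threshold in this toy case improves purity without gaining completeness, which mirrors the trade-off described in the text.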

Conservatively, we expect to obtain a sample (>90% confidence)
of a few tens of $M_{z=0} > 10^{15}\,M_\odot$ proto-clusters and
a few hundred of $M_{z=0} > 10^{14.5}\,M_\odot$ in HETDEX-SHELA at
1.9 < z < 3.5, and another hundred $M_{z=0} \sim 10^{15}\,M_\odot$
proto-clusters in HETDEX-DEX.

7. DISCUSSION 

Here we focus our discussion on proto-cluster identification
quantified in terms of $M_{z=0}$, the comparison between the HPS
structure and other known high-redshift overdensities in the
literature, and the dependence of galaxy properties on
large-scale environment.

In §4, we identified regions that match the HPS z = 2.44 
structure in the four mock LAE catalogs (§ 2.3) of different 
clustering within the uncertainty of that observed. These four 
mocks essentially yield the same prediction on the z = 0 de¬ 
scendant cluster mass of Mq. A key reason for this 

result is that the abundance of z = 0 clusters, quantified by
the z = 0 halo mass function, imposes a strong prior in determining
the fate of the observed high-z overdensity (especially at
the massive end). A higher LAE overdensity than that of the
HPS z = 2.44 structure would not increase the inferred $M_{z=0}$
substantially; instead, it might pose challenges to the concordance
cosmology (in particular, $\Omega_m$ and $\sigma_8$), as the probability
of finding such a density peak would be extremely low. For
Mock I to IV, only 3% to 9% of the HPS-COSMOS realizations
(of the same survey volume) have a region that meets
the mock structure criteria. Therefore our discovery of this
dense z = 2.44 structure is perhaps a great coincidence. However,
the cluster interpretation is consistent with the “proto-group”
study in the zCOSMOS-deep survey (Diener et al.
2013), in which no larger structure is found as traced by their
spectroscopic sample in a field several times larger than that of
HPS-COSMOS. Our results also imply that an LAE bias
much lower than 2 (thus a higher inferred mass density)
would produce a conflict with the concordance cosmology.
Similar problems would be raised if the clustering of LAEs
was not modeled (see § 2.3) to have a realistically large cosmic
variance as constrained by their wide stellar mass distribution.

Figure 6. Line of sight velocity distribution of the LAEs in the HPS z = 2.44 structure compared with that of proto-clusters in the literature around powerful radio galaxies (RG; Kurk et al. 2000, 2004; Pentericci et al. 2000; Venemans et al. 2002, 2005, 2007) and the SSA22 field (Steidel et al. 1998, 2000; Hayashino et al. 2004; Matsuda et al. 2005; Yamada et al. 2012a). The histograms are normalized to the same scale of surface number density in an arbitrary unit of number per comoving area (left y-axis). Red histograms represent LAEs above the Lyα luminosity limit of HPS, and gray histograms show, in the case of radio galaxy proto-clusters, LAEs with deeper Lyα luminosity limits indicated in the figure legend. Slightly different $EW_{\rm Ly\alpha}$ criteria of > 20 Å and > 15 Å were adopted in the selections of LAEs in HPS and RG fields, respectively, while a stricter criterion of $EW_{\rm Ly\alpha}$ > 40 Å was adopted in the case of SSA22.


In § 3.1, we showed that the HPS z = 2.44 structure has a
higher signal-to-noise ratio in stellar mass overdensity than
in galaxy number counts using the same set of continuum-selected
galaxies with photo-z. In § 3.3, we demonstrated that
the median stellar mass of galaxies in the overdensity is about twice
that outside the overdensity. These results suggest that the
onset of star formation in this structure occurred at significantly
earlier epochs, supporting the picture of “cosmic
downsizing.” The result agrees well quantitatively with that
found in a proto-cluster in the quasar HS1700+643 field at
z = 2.30 (Steidel et al. 2005), and is consistent with the high
formation redshifts inferred from stellar population synthesis
of the low-redshift cluster red sequence (e.g., Rettura et al.
2010). It remains to be tested, with a future large sample of
proto-clusters, whether the stellar mass excess is generic for
high-redshift overdensities.

The cumulative star formation (stellar mass) in a region
is a direct consequence of the past accretion and cooling of
baryons triggered by the gravitational field of the total matter
enclosed, whereas the density contrast in terms of pure
number counts in a dense region can be reduced by galaxy
mergers as structure/galaxy formation progresses. Therefore
it is expected that the stellar mass density field traces the underlying
matter density field more tightly. We suggest that in
the case of photometric surveys with or without subsequent
spectroscopy (where stellar mass can be better measured than
in emission-line galaxy surveys), an analysis of stellar mass
density contrast should ideally replace galaxy number counts
as a standard technique to (1) define galaxy environment and
identify possible environmental effects, (2) recover the underlying
matter field, and (3) identify proto-clusters and predict
their $M_{z=0}$. For most photometric surveys, the resources required
for measuring stellar mass do not exceed those for measuring
photometric redshifts to a sufficient accuracy. Thus a
boost in performance for the aforementioned applications can
be expected at no extra cost.

The HPS z = 2.44 structure does not have a significantly
high level of total instantaneous SFR estimated by SED fitting
of the continuum-selected galaxies. This result is in line with
the general understanding that galaxy star formation is considerably
bursty and could be triggered by sporadic and instantaneous
accretion of cold streams from the cosmic web (Dekel
et al. 2009b) or violent disk instabilities (Dekel et al. 2009a;
Overzier et al. 2009). Therefore a measure of the large-scale
SFR density field would be quite noisy compared to that of
the stellar mass.

We compare the HPS z = 2.44 structure studied in this work
with other proto-clusters in the literature. Figure 6 shows
the line of sight velocity distribution of LAEs in the HPS
z = 2.44 structure (same as Figure 4), five previously known
LAE overdensities around powerful radio galaxies (Kurk et al.
2000, 2004; Pentericci et al. 2000; Venemans et al. 2002,
2005, 2007), and the structure in the SSA22 field at z = 3.08
(Steidel et al. 1998, 2000; Hayashino et al. 2004; Matsuda
et al. 2005; Yamada et al. 2012a). These comparison structures
were observed with deep narrow-band imaging at the
wavelength of Lyα of the radio galaxies or, in the case of
SSA22, of a serendipitously discovered overdensity in a redshift
survey of continuum-selected galaxies. A slightly more
relaxed $EW_{\rm Ly\alpha}$ criterion of > 15 Å (compared to the > 20
Å used in HPS) was adopted in the LAE selection in the radio
galaxy fields, while a stricter criterion of $EW_{\rm Ly\alpha}$ > 40 Å
was used in the case of SSA22. These narrow-band selected
LAEs were then investigated by slit spectroscopy, revealing
a redshift-space concentration of a few tens of LAEs for each
(gray histograms). Similar to the HPS structure, the structures
around radio galaxies are each expected to evolve into a galaxy
cluster of several times $10^{14}\,M_\odot$ by z = 0 based on the level of
LAE overdensity observed; the overdensity of the continuum-selected
LBGs in the SSA22 field suggests a slightly higher
z = 0 cluster mass of ~ $10^{15}\,M_\odot$ (see summaries and discussion
in Steidel et al. 1998; Venemans et al. 2007; Chiang et al.
2013b). The detection limit in terms of the Lyα luminosity
for the HPS z = 2.44 structure is relatively shallow compared
to these comparison proto-cluster fields. Taking this limiting
luminosity into account, the HPS structure shows an LAE excess
that is comparable to all the comparison structures (except
for PKS 1138-262, which lacks very bright LAEs). In
fact, the comoving number density of LAEs in the HPS structure
is higher than that of all the radio galaxy structures, and
similar to that of the SSA22 structure if observed down to
the same HPS depth (red histograms).** Thus a large population
of faint LAEs might exist for the HPS z = 2.44 structure,
requiring deeper observations to confirm. Similarly, the
ZFOURGE z = 2.10 proto-cluster (see the Appendix) might
also host a population of faint LAEs yet to be observed. The
HPS z = 2.44 and radio galaxy structures, all having a similar
end point in terms of z = 0 cluster mass, can provide a
rough evolutionary picture of early cluster kinematics across
cosmic time. In general, the line of sight velocity dispersion
of proto-clusters increases from < 300 km s$^{-1}$ for TN
J1338-1942 at z = 4.11 to ~ 900 km s$^{-1}$ for the three structures
at z ~ 2.5 (PKS 1138-262, MRC 0052-241, and MRC
0943-242). However, this latter velocity dispersion might be
too large for the structures to collapse entirely by z = 0 (see
§ 4), and perhaps by coincidence, these three overdensities
around radio galaxies all show a bimodal velocity structure.
Conversely, the three higher redshift radio galaxy structures
(MRC 0943-242, MRC 0316-257, and TN J1338-1942) show
a clearer central concentration in velocity space. The perhaps
slightly more massive proto-cluster at z = 3.08 in the
SSA22 field shows a double-peaked profile in the LAE line of sight
velocity distribution, with a combined dispersion of ~ 1000
km s$^{-1}$. Such a velocity distribution suggests, again, that the
large structure in the SSA22 field is unlikely to collapse entirely
by z = 0. A massive descendant cluster of ~ $10^{15}\,M_\odot$
connected with dense filaments, or a pair of slightly lower
mass clusters, is expected at z = 0. In conclusion, this comparison
demonstrates that proto-clusters, though they can be
characterized by $M_{z=0}$ to first order, show a wide variety of
topology in the phase-space mass distribution. A larger sample
of proto-clusters with different masses and topologies at different
redshifts is needed for detailed investigations.
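
For readers who want to reproduce a line of sight velocity comparison of this kind, the conversion from redshift to rest-frame velocity offset can be sketched as follows. The example uses the three LAE redshifts listed in Table 4 for the z = 2.10 structure; with only three objects the resulting spread is illustrative, not a measured velocity dispersion. The sample standard deviation is an assumption of this sketch, and the estimator used for the dispersions quoted above may differ.

```python
import statistics

C_KM_S = 299_792.458  # speed of light in km/s


def los_velocity_offsets(redshifts):
    """Rest-frame line of sight velocity offsets relative to the mean redshift.

    Uses dv = c (z - z_mean) / (1 + z_mean), valid for small offsets.
    """
    z_mean = statistics.fmean(redshifts)
    return [C_KM_S * (z - z_mean) / (1.0 + z_mean) for z in redshifts]


# Lyα redshifts of HPS objects 244, 261, and 313 from Table 4.
z_table4 = [2.0996, 2.0960, 2.0975]

dv = los_velocity_offsets(z_table4)
sigma_v = statistics.stdev(dv)  # sample standard deviation in km/s
```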

In § 5 we demonstrated that there is a boost of extended
LAEs and a marginal excess of AGNs in large-scale overdensities.
Under the scenario of resonant scattering of Lyα
in the circumgalactic medium (CGM), the production of
Lyα photons and the phase-space distribution of the CGM
around galaxies together determine the spatial profile of Lyα
emission (Laursen & Sommer-Larsen 2007; Laursen et al.
2009a,b; Zheng et al. 2011; Dijkstra & Kramer 2012; Verhamme
et al. 2012). We speculate that the super-halo scale
galaxy environment might be connected to the excess of extended
Lyα halos through (1) triggering AGNs (thus elevating
the production of Lyα photons) for luminous LAEs and (2) altering
the CGM profile and/or inflow/outflow structures of all
galaxies in general. These speculations are supported by the
correlations of extended LAEs and environment in our HPS
data, and also by various reports in the literature.

** To compare observations with different fields of view and at different
redshifts, we normalize the histograms to the same scale of surface number
density in an arbitrary unit of inverse comoving area (left y-axis).

First, the excess of AGNs in high-redshift large-scale overdensities
is found in other systems (Pentericci et al. 2002;
Lehmer et al. 2009; Martini et al. 2013), possibly triggered
by efficient gas accretion and subsequent funneling induced
by frequent galaxy interactions. Second, nearly all the lu¬ 
minous Lya blobs > 10"^^'^ergs”') show signatures of 
obscured quasars (Overzier et al. 2013), and many of them
are located in dense environments. Third, while extremely
deep narrow-band imaging suggests that halos of scattered
Lyα are a generic feature of typical high-redshift star-forming
galaxies (Steidel et al. 2011; Momose et al. 2014), the scale
lengths of the Lyα radial profile are small in the field (Feldmeier
et al. 2013) and significantly elevated as super-halo
scale galaxy densities increase (Matsuda et al. 2012). The
Matsuda et al. (2012) comparison was done while controlling
for the UV magnitude, which traces the ionizing photons
generated by young stars. This correlation thus needs to
be explained beyond the amount of intrinsic Lyα production,
possibly through a correlation between environment and the
phase-space distribution of the CGM.

The density gradient of the CGM of galaxies in dense environments
might be flattened, as the baryons follow the two-halo
term clustering of DM halos at this scale (Zheng et al.
2011). Furthermore, the CGM dynamics and inflow/outflow
structures might be affected, manifested in a shorter fallback
time scale of the galactic wind launched by galaxies in dense
environments (Oppenheimer & Davé 2008; Davé et al. 2011).
This effect is expected to be more prominent for low-mass
galaxies, where the local gravitational potential is not deep
enough to govern entirely the galaxies’ baryonic cycle. It remains
to be explored how efficient this mechanism can be on
scales exceeding a single DM halo, and how CGM dynamics
is connected to the resonant scattering and escape of Lyα.
In the 28 deg$^2$ HETDEX-SHELA survey, where complete
coverage will be achieved by dithering, Lyα blobs extending
significantly beyond a fiber (1.5" diameter, ~ 12 physical kpc
at z = 2.5, 1/3 of that of HPS) can be identified. The correlation
between diffuse Lyα halos and galaxy environment thus
can be quantified with high-quality statistics.
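
The quoted fiber footprint of ~ 12 physical kpc at z = 2.5 can be checked with a short Python sketch. The cosmological parameters below (flat ΛCDM with H0 = 70 km s$^{-1}$ Mpc$^{-1}$ and Ω_m = 0.3) are assumptions made for this example and are not necessarily those adopted in the paper; the function names are hypothetical.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s
ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, steps=2000):
    """Angular diameter distance in Mpc for a flat LCDM model (assumed).

    The comoving distance integral is evaluated with the trapezoid rule.
    """
    omega_l = 1.0 - omega_m

    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + omega_l)

    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    integral += sum(inv_e(i * dz) for i in range(1, steps))
    comoving = (C_KM_S / h0) * integral * dz
    return comoving / (1.0 + z)


def physical_size_kpc(theta_arcsec, z, **cosmo):
    """Physical transverse size in kpc subtended by theta_arcsec at redshift z."""
    d_a = angular_diameter_distance_mpc(z, **cosmo)
    return theta_arcsec * ARCSEC_TO_RAD * d_a * 1000.0


# 1.5 arcsec fiber diameter at z = 2.5, as quoted in the text.
fiber_kpc = physical_size_kpc(1.5, 2.5)
```

With these assumed parameters the result is close to the ~ 12 physical kpc stated in the text.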

In § 6 we presented the expected performance of proto-cluster
identification in HETDEX. As shown in § 3 and § 4,
the nine LAEs in HPS provide a much more stringent constraint
on the $M_{z=0}$ of the z = 2.44 structure than that from
the > 30 continuum-selected galaxies with photo-z derived
from ~ 30 bands in COSMOS/UltraVISTA. Although the
technique of stellar mass overdensity enhances the signal
of the underlying matter density field, the photo-z errors of
continuum-selected galaxies significantly increase the noise
level (see Figure 13 in Chiang et al. 2013b). The key advantage
of emission-line galaxy surveys in proto-cluster searches
is the ability to measure precise galaxy redshifts efficiently,
thus reducing the effects of line of sight projection. However,
broad-band imaging with a wide range of wavelength
coverage is still crucial for galaxy population studies. The 28
deg$^2$ HETDEX-SHELA field will be extremely valuable on
this subject for its LAE redshifts and deep optical to far-IR
photometry (u, g, r, i, z, K, Spitzer-3.6, 4.5 μm, Herschel-250,
350, and 500 μm).


8. CONCLUSION

Galaxy proto-clusters at z > 2 can be found and confirmed
efficiently in large emission-line galaxy surveys. In this paper,
we presented the discovery and a detailed characterization
of a large-scale structure containing a proto-cluster at
z = 2.44 traced by LAEs in the HETDEX Pilot Survey. The
same structure is also seen in continuum-selected photometric
redshift catalogs, and appears as a significant overdensity in
stellar mass density and gas absorption maps. We constructed
a set of mock LAE catalogs matching the clustering properties
of the observed LAEs, examined the fate of this HPS structure,
and demonstrated the expected performance of proto-cluster
identification in the full HETDEX survey, which will
confirm a large number of structures similar to the one studied
here.

• The HPS, which performed an LAE survey of ~ 7' × 10' in
COSMOS at 1.9 < z < 3.8, discovered a prominent density
concentration of nine bright LAEs at z = 2.44. With the photometric
redshift galaxy catalog of COSMOS/UltraVISTA,
we demonstrated that this structure is also seen in overdensities
of continuum-selected galaxies in both number
counts and volume-specific stellar mass. The structure extends
> 30 Mpc comoving along the line of sight with
two subgroups of LAEs, and ~ 20 Mpc comoving on
the sky as revealed by the continuum-selected galaxies. Using
the zCOSMOS survey and additional spectroscopy, Diener
et al. (2013, 2015) identified and confirmed a galaxy
overdensity adjacent to the HPS-COSMOS field at z = 2.45,
which appears connected to the HPS structure presented in
this paper. We find other independent evidence of this structure
in the literature, including an excess of Lyα absorbing
gas (Lee et al. 2014a).

• To compare the HPS structure with simulations of cosmic
structure formation, we constructed a set of mock LAE catalogs
from the SAM of Guo et al. (2013). The LAEs were
modeled based on the Lyα production by young stars and an
empirical treatment of the escape of Lyα in dusty ISM. The
modeling self-consistently reproduces the observed LAE
galaxy bias and stellar mass distribution. Regions in the
mocks as dense as the HPS z = 2.44 structure are then identified
and tracked to z = 0. The HPS structure, although
spanning a few tens of Mpc comoving, should have already
broken away from the Hubble flow. Part of the structure
will collapse to form a galaxy cluster with ^ Mq by 

z = 0. 

• Four of the nine LAEs in the HPS structure are significantly
extended in Lyα emission, and one of them (also an extended
Lyα source) shows an AGN signature in X-ray. We
speculate that a super-halo scale dense environment might
facilitate AGN activity and alter the CGM profiles around
high-redshift star-forming galaxies, boosting the spatial extent
of Lyα. The median stellar mass of the continuum-selected
galaxies in the HPS structure is about twice that of
the field counterparts. These results support an accelerated
co-evolution of massive galaxies and their supermassive
black holes in overdense environments.

• Finally, we predict the performance of proto-cluster identification
in the coming HETDEX survey, which will observe
of order a million LAEs at 1.9 < z < 3.5. In the full
~ 450 deg$^2$ HETDEX, where 1/4.5 of the sky will be covered
by IFU fibers, we expect to confirm at least a hundred
massive proto-clusters with $M_{z=0} \sim 10^{15}\,M_\odot$. In the 28
deg$^2$ HETDEX-SHELA field, where complete sky coverage
will be performed, we expect to obtain a few tens of
proto-clusters with $M_{z=0} > 10^{15}\,M_\odot$, and a few hundred with
$M_{z=0} > 10^{14.5}\,M_\odot$. Together with a rich set of ancillary photometry,
the HETDEX-SHELA field will provide a powerful
data set to study the rapid mass assembly and galaxy
growth of present-day massive clusters in their formation
epoch.
APPENDIX

ANOTHER STRUCTURE AT z = 2.10

HPS-COSMOS also partially covers another proto-cluster at z = 2.10 (indicated by a short thick tick in Figure 2) discovered
through galaxy overdensities in the deep medium-band photometric survey ZFOURGE (Spitler et al. 2012). Three cores of possibly
virialized halos of > $10^{13}\,M_\odot$ at the observed epoch are identified. In Chiang et al. (2014), we recovered this structure on a
scale of ~ 15 Mpc comoving using the same COSMOS/UltraVISTA photometric redshift galaxy catalog used here, and also
revealed 35 other candidate proto-clusters in the COSMOS field. Recently, Yuan et al. (2014) performed a large spectroscopic
campaign and confirmed more than 50 objects in this structure, estimating a redshift zero virial mass of ^ Mq. 

Within the region of the three cores, we do not detect any LAE in HPS, but we did find three LAEs (summarized in Table 4;
see also Figure 2) associated with this z = 2.10 proto-cluster several arcminutes away from the cores, indicating that the overdensity
of this structure indeed has a large spatial extent, consistent with what was reported in Chiang et al. (2014) and the theoretical
expectation for a forming cluster (Chiang et al. 2013b). In the overdensity/field comparisons of galaxy populations in § 5 and § 7,
we have treated these three z = 2.1 LAEs as located in a large-scale overdensity.
Table 4

HPS-COSMOS Lyα Emitter Catalog in the z = 2.10 Structure

| HPS Index | z (Lyα)^a | α (J2000) [deg] | δ (J2000) [deg] | Flux (Lyα) [10^-17 cgs] | L (Lyα) [10^42 cgs] | Spectral FWHM^b [km s^-1] | Spatial FWHM^c [arcsec] | Counterpart m_R [mag] | Counterpart Prob.^d | EW_rest (Lyα)^e [Å] | Flux_X-ray (0.5-10 keV) [10^-17 cgs] |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 244 | 2.0996 | 150.09858 | 2.22000 | 10.4 +…/−… | 3.5 +…/−… | 114 | 3.5 +…/−… | 26.02 | 0.25 | | |
| 261 | 2.0960 | 150.11904 | 2.29678 | 143.7 +23.2/−… | 48.4 +…/−… | 886 | | 23.76 | 0.87 | 536.7 +…/−… | 2040 ± 125 |
| 313 | 2.0975 | 150.16992 | 2.30656 | 25.1 +12.4/−… | 8.5 +…/−… | 249 | 5.0 +…/−… | 22.75 | 0.98 | … +12.3/−9.7 | |

^a With an uncertainty of 4 × 10^-4 based on a 0.5 Å line center uncertainty.

^b After deconvolution with a 5 Å FWHM instrumental resolution (σ_inst ~ 130 km s^-1).

^c Including a tophat component of the fiber size of 4".235 and the effects of dither pattern and discrete sampling.

^d Probability of counterpart association (R-band).

^e Based on an interpolation between the two nearest filters for continuum.
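
The redshift uncertainty quoted in the first table note follows directly from the line center uncertainty, as the short Python sketch below shows. The Lyα rest wavelength of 1215.67 Å is a standard value; the velocity conversion at z = 2.10 is added here for illustration only and is not a number stated in the paper.

```python
LYA_REST_ANGSTROM = 1215.67  # Lyα rest-frame wavelength
C_KM_S = 299_792.458         # speed of light in km/s


def redshift_uncertainty(sigma_lambda_obs):
    """Uncertainty on z = lambda_obs / lambda_rest - 1 for a Lyα line."""
    return sigma_lambda_obs / LYA_REST_ANGSTROM


# 0.5 Å uncertainty on the observed line center, as stated in the table note.
sigma_z = redshift_uncertainty(0.5)

# Equivalent rest-frame velocity uncertainty at z = 2.10 (illustrative).
sigma_v_km_s = C_KM_S * sigma_z / (1.0 + 2.10)
```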



